ABSTRACT

Intended for teachers and administrators who wish to
assess student oral communication needs before designing an
appropriate program, this guide provides a review of tests for
measuring acts of speaking and listening. The guide surveys and
discusses procedures for assessing speaking and listening skills
among school children and focuses on technical issues of measurement
and pragmatic questions of administrative feasibility. The first
section provides a review and critique of procedures for assessing
oral communication skills. The second section reviews 45 oral
communication assessment instruments, including the California
Achievement Test for Listening, Comprehensive Test of Basic Skills,
Metropolitan Achievement Tests, and the National Assessment of
Educational Progress Pilot Test of Speaking and Listening. Appendixes
contain standards for effective oral communication programs, and
criteria for evaluating instruments and procedures for assessing
speaking and listening. (HTH)
Preface



Because the evaluation of oral communication is an integral part of the education process, the
Speech Communication Association is publishing this review of tests for measuring acts of
speaking and listening. Teachers and administrators, desiring to undertake that all-important
first step of assessing student needs before a program is designed, ask, "Which test should I
use?" This publication directs us to what choices are available. Since states and school
districts vary in their program goals for speaking and listening, the choice of tests will also
vary.

Realizing educators might want a professional evaluation of currently available tests, a special
Committee on Assessment and Instrument Development was appointed by the Speech
Communication Association's Committee on Assessment and Testing. Don Rubin and Nancy Mead
were major participants on that committee; their thorough work led to this publication.

Whether a state evaluation is taking place or a teacher wants a test for the classroom, this
review should be helpful. Test validity and reliability are common concerns for the educator
constructing tests; equally important are questions of cost, time, and scoring procedures. The
forty-five tests reviewed in Chapter II include each of these concerns, as well as a full description
of the assessment instrument. To help in the search for comparable tests, the age range and
skills tested are listed first. Also included is a section on evaluative comments. These
professional judgments should provide the reader with information necessary to make good choices.

Testing is a value-laden subject. For some, all testing is loathsome and antithetical to the
spirit of learning. Others think testing is burdensome but necessary for providing feedback
about educational outcomes. Yet another point of view regards evaluation opportunistically, as a
powerful tool for shaping curriculum or manipulating educational policy.

This review surveys and discusses procedures for assessing speaking and listening skills
among public school aged children. In doing so, it focuses on technical issues of measurement
and pragmatic questions of administrative feasibility. But the materials also emphasize the
responsibility in testing, and users of this document must not lose sight of the purposes of oral
communication assessment and must weigh the variety of costs against the potential value of test
results. For communication does not thrive in an exclusively evaluative climate. Assessment
is beneficial only to the degree that it yields interpretable results: data that are directed toward
solving educational problems, data that reflect communication skills and behaviors that are
central to effective functioning, not merely data that are readily measurable. In this light, the
task of evaluating, selecting, or developing appropriate measurement instruments demands that
educators render decisions on as informed a basis as possible. It is our hope that this information
will facilitate such decisions, and thus contribute to responsible evaluation.

In any testing situation the user must know what is and what is not being tested. In any
testing situation the user must know how to communicate the results of the test with the
[illegible] in the testing situation. The purpose of the information provided in this
guide is to aid in that communication.




Foreword 



The Educational Resources Information Center (ERIC) is a national information system
developed by the U.S. Office of Education and now sponsored by the National Institute of
Education (NIE). It provides ready access to descriptions of exemplary programs, research and
development efforts, and related information useful in developing more effective educational
programs.

Through its network of specialized centers or clearinghouses, each of which is responsible
for a particular educational area, ERIC acquires, evaluates, abstracts, and indexes current
significant information and lists this information in its reference publications.

ERIC/RCS, the ERIC Clearinghouse on Reading and Communication Skills, disseminates
educational information related to research, instruction, and personnel preparation at all levels
and in all institutions. The scope of interest of the Clearinghouse includes relevant research
reports, literature reviews, curriculum guides and descriptions, conference papers, project or
program reviews, and other print materials related to all aspects of reading, English, educational
journalism, and speech communication.

The ERIC system has already made available, through the ERIC Document Reproduction
System, much informative data. However, if the findings of specific educational research are
to be intelligible to teachers and applicable to teaching, considerable bodies of data must be
reevaluated, focused, translated, and molded into an essentially different context. Rather than
resting at the point of making research reports readily accessible, NIE has directed the separate
clearinghouses to work with professional organizations in developing information analysis
papers in specific areas within the scope of the clearinghouses.

ERIC is pleased to cooperate with the Speech Communication Association in making Large
Scale Assessment of Oral Communication Skills: Kindergarten through Grade 12 available.

Charles Suhor 
Director, ERIC/RCS 
I. A Review and Critique of Procedures for
Assessing Oral Communication Skills



WHY TEACH ORAL COMMUNICATION SKILLS?¹

A well-known adage has it that of all the creatures inhabiting the earth, fish are the least likely
to ever discover water. So it is with communication. Speech comes to us as part of our innate
endowment as human beings. We are engulfed by communication in all our daily affairs.
Usually we are not directly aware of our oral communication environment. But it is nonetheless
vital to our well-being and survival.

Speaking and listening are prerequisites to success in school. Most instructions for classroom
procedures are delivered orally by teachers. Consequently, students with deficient listening
skills fail to absorb much of the material to which they are exposed. Their problems are
intensified when they respond incorrectly because they do not listen to questions carefully.
Students who listen poorly are often isolated and left out of classroom activities. Speech
performance also affects academic achievement. Students who cannot adequately express their
knowledge are judged ignorant. Some speech styles trigger stereotyped expectations of poor
ability, expectations that are likely to be self-fulfilling (Williams, Whitehead, and Miller,
1972). Quiet children may be appreciated for their "good behavior," but they are subject to
similarly negative school experiences (McCroskey and Daly, 1976). Students who do not ask
for assistance will not receive adequate assistance. One research study, for example, found that
reticent students progressed slowly through a self-paced reading program, despite normal levels
of reading aptitude. The reason for their poor performance was that these students rarely
approached teachers for help (Scott, Yates, and Wheeless, 1975).

Beyond the confines of school, oral communication proficiency contributes to social
adjustment and satisfying interpersonal relationships. Youngsters with poor communication skills
are sometimes viewed as unattractive by their peers and enjoy few friendship bonds (Hurt and
Preiss, 1978). Antisocial and violent behavior is frequently attributable to underdeveloped social
sensitivity and lack of conflict resolution techniques. Remediation programs have reduced the
incidence of antisocial acts by means of communication training (Chandler, 1973). Counselors
acknowledge that many family problems are caused by poor communication, and may be
ameliorated by improving interaction between family members (Shure and Spivack, 1978).

Speaking and listening are no less crucial in the marketplace. Communication skills rank
high among lists of managerial competencies. An officer of one computer firm, for example,
states that the company prefers to conduct its own training in computer programming, but seeks
employees with strong communication abilities (Gruner, Logue, Freshley, and Huseman, 1977).
Professionals (doctors, lawyers, engineers, teachers) require more than just subject matter
expertise. These professionals must listen effectively to their patients, clients, or students in
order to identify and analyze problems. They must speak effectively in order to implement
their solutions. Individuals who speak in a nonstandard fashion (DeLaZerda and Hopper, 1978)



¹An earlier version of this section appeared in D. Rubin and R. Bazzle, Development of an Oral
Communication Assessment Program: The Glynn County Speech Proficiency Examination for High School
Students (Brunswick, Georgia: Glynn County School System, 1981). The authors express their
appreciation to the Glynn County School System for use of this material.
or who withdraw from speaking (Richmond, 1977) tend to be regarded by personnel officers
as prospects for only low status, low paying jobs. Nonetheless, even unskilled workers have
occasion to engage in job related speech, including a surprising amount of public speaking
(Kendall, 1974).

Speech curricula have traditionally stressed the importance of communication for the
preservation of a democratic society. Throughout its history, America has fought vigorously to
safeguard freedom of expression under the assumption that full citizen participation is the surest
guarantee against tyranny. Surely not every citizen deliberates as a member of a legislative
body, but numerous opportunities for citizen input are available. These include participation
in civic associations, public hearings, and citizen lobbying. At the very least, citizens are
responsible for staying informed, and much of the pertinent information must be culled by
listening.

Finally, oral communication is essential to full psychological development. Self-concept is
acquired through interaction with others (Mead, 1934). Self-actualization, a sense of fulfillment
(Maslow, 1954), usually involves interpersonal activities: making contributions, exerting
influence, or being recognized in a social manner. In addition, speech is a means for artistic
expression and self-discovery.

The fact that all students come to school with basic speaking and listening skills and also
seem to develop more mature behaviors on their own as they grow older does not imply that
all students are effective communicators. Educators occasionally comment, "My students don't
need to learn how to talk. That's one thing they do too much of." But effective communication
must be cultivated. Students may lack clarity in their speech. Their listening comprehension
may not attain its fullest potential. Students who communicate well in familiar settings may
lack the confidence and flexibility needed to express themselves effectively in a wider range
of situations. Educators cannot rely on haphazard, unguided learning outside of the classroom
to impart communication effectiveness. Systematic instruction is imperative.

Still, of all the basic skills, speaking and listening are most often neglected in schools. This
neglect transpires despite numerous curriculum documents that urge attention to oral abilities.
Undoubtedly, a host of factors discourage teachers from implementing oral communication
instruction. Teachers are held accountable for students' reading achievement, for performance
on mandated grammar tasks, for monitoring attendance, for giving enough homework, for not
giving too much homework. But teachers are generally not held accountable for teaching students
to speak and listen effectively. Furthermore, few teachers have received training in
communication education or have materials available to aid instruction. Consequently, little concerted
instruction in speech communication takes place.

If students' speaking and listening proficiency were systematically evaluated, it is likely that
schools would systematically implement oral communication instruction. One substantial benefit
of large scale assessment of oral communication skills is that such testing can guide innovation
in this curriculum domain. Indeed, experience in Great Britain and elsewhere demonstrates
that speech assessment has a "washback" effect on the amount and kinds of speech teaching
undertaken in classrooms (Barnes, 1980).

Another benefit of oral communication assessment is that test results can be used to make
decisions about the best manner in which to place individual students in instructional sequences.
Assessment procedures that yield fine-grained analyses, rather than global judgments, can be
used for diagnostic purposes (Rubin, 1981). Thus, for example, students who have difficulty
in vocal production factors might concentrate on oral reading, while those whose difficulties
lie in the area of organization might cycle through a set of story-telling exercises before
progressing to explanatory discourse. Students who demonstrate strengths in, say, literal
comprehension of spoken materials might advance to instructional units emphasizing critical listening
skills.
Speaking and listening tests can also provide valuable information for program evaluation.
Since large scale programs of oral communication improvement are in their infancy, it is
especially important to evaluate their effectiveness and to secure data that will enable these
programs to be "fine tuned." Program (and teacher) effectiveness is best judged with reference
to student achievement in program objectives. If students are not achieving criterion
performance levels in language use, for example, teachers and administrators will recognize that
additional instructional effort needs to be directed to this area. It is worth noting, however,
that student achievement can be interpreted as an indicator of program success only when
student aptitude and institutional resources, the raw materials with which the program has to
work, are also taken into account. Also, student achievement is not the only data that might
contribute to program evaluation. Attitudinal outcomes, self-evaluations, and peer-evaluations are also
useful information for this purpose. After emphasis on listening, the listening scores might not
improve as much as teacher reports on better responses to directions on assignments.

A final use for speaking and listening assessments is to certify students as having attained
(or not attained) mastery in oral communication. Certification in basic skills is increasingly
demanded by competency-based education movements. Promotion or graduation decisions may
be based upon such certification. Backlund and his associates (1981) surveyed many of the
state and local jurisdictions that have already adopted large scale tests of speaking and listening
skills as a means of certifying students' competencies in the basic skills.



BACKGROUND AND OBJECTIVES 

Several developments in the past few years motivated the test review effort represented in this
publication. Set in the background of a nationwide movement toward competency-based
education, speech communication educators recognized that their field was not exempt from the
challenge of accounting for educational outcomes (Ritter, 1978). A major step in facilitating
that accountability was the publication of the results of the National Speech Communication
Competencies Project (Allen and Brown, 1976). This document examined the development of
communication skills and promulgated a description of functional communication competence
that has helped guide many subsequent efforts in curriculum, instruction, and evaluation. A
list of competencies in speaking and listening for high school graduates (Bassett, Whittington,
and Staton-Spicer, 1978) has likewise influenced teaching and assessment. In their "Standards
for Effective Oral Communication Programs" (see Appendix A), the American Speech-
Language-Hearing Association and the Speech Communication Association asserted that
effective instructional efforts must include provisions for appropriate and constructive methods
of assessment and evaluation. Such methods were further clarified by "Criteria for Evaluating
Instruments and Procedures for Assessing Speaking and Listening" (see Appendix B), a
document endorsed by the Speech Communication Association.

Despite this initial impetus to evaluate communication competencies, despite the view that
developing assessment procedures presents no insurmountable technical obstacles (Larson,
1978; McGlone, 1973), and despite some concrete suggestions of pertinent measurement
instruments (Larson, Backlund, Redmond, and Barbour, 1978; McCaleb, 1979; McCaleb and
Korman, 1978), attempts to implement large scale assessments of speaking and listening skills
have not been forthcoming. In general, evaluation programs have been stymied by a scarcity
of suitable instruments (Brown, Backlund, Gurry, and Jandt, 1979; Plattor, Unruh, Muir, and
Loose, 1978). Consequently, the Steering Committee of the Task Force on Assessment and
Testing of the Speech Communication Association acted to establish committees for the purpose
of identifying existing instruments and furthering the development of additional instruments
for the measurement of communication skills. This monograph is an outgrowth of one such
committee.²

Four primary objectives guided this effort:

• To monitor existing assessment instruments in oral communication.

• To abstract and describe assessment instruments and systematically report their availability
to the Task Force and the Association.

• To study and recommend research that is needed to develop listening and speaking
assessment instruments for elementary and secondary schools.

• To encourage development of new instruments by commercial and noncommercial sources.

In order to delimit the scope of the task and to set priorities, the committee defined its focus
further. Particular attention was paid to assessment instruments that evaluate communication
behaviors, as opposed to instruments that describe behaviors but assign no judgments of quality.
In addition, emphasis was placed on measures of communication per se (verbal and nonverbal
encoding and decoding in situations ranging from high interaction to extended and uninterrupted
discourse), rather than on measures that focused exclusively on component subskills like
language, role-taking, articulation, or perceptual acuity. The effort was directed toward instruments
that had the measurement of communication as their main purpose, not those that used
communication incidentally as a means to measure other skills. This latter emphasis did not
necessarily exclude indirect measures of communication competency, but it severely constrained
the types of indirect measures that might be found suitable. The search was narrowed to
assessment procedures that seemed amenable to large scale testing in institutional/school
settings. Finally, attempts were made to include instruments appropriate for a variety of individuals
including nonnative speakers, minority culture children, and students with special needs.

PROCEDURES FOR REVIEWING ASSESSMENT INSTRUMENTS 

Given the objectives and emphases described in the preceding section, a number of potential
sources of existing instruments were searched. Letters were sent to major commercial test
publishers. Previously published compendiums of communication measures (Larson, et al.,
1978; Brown, et al., 1979; Plattor, et al., 1978) were consulted, as were more general lists of
tests and evaluation instruments (Fagan, Cooper, and Jensen, 1975; Buros, 1978; Johnson,
1976; Grommon, 1976). The literature on second language testing was also a valuable source
of information (e.g., Lange and Clifford, 1980; Richard, 1981). The assessment committee
collected a number of evaluation procedures produced by state and local education agencies.
In addition, an ERIC search was conducted. Calls for assessment instruments were published
in SPECTRA and in the Newsletter of the National Conference on Research in English. Finally,
individual committee members contributed to the data base by examining literature in their
areas of expertise.

A catalogue of instruments that met the criteria of the assessment committee is presented in
Table 1. Each instrument was assigned nonsystematically to a single committee member for
review. These reviews appear in Chapter 2 of this publication. The contents of the reviews
reflect the views of the individual reviewers as influenced by their expert judgment.

The form used for the instrument reviews presented in this book is primarily descriptive.



²Other committee members who provided input to this report are J. Daly (University of Texas), W. P.
Dickson (University of Wisconsin), and J. McCroskey (West Virginia University). Their contributions
to this effort are gratefully acknowledged.
Skills Tested
listening; visual decoding; auditory discrimination
oral language; listening; social development; auditory perception
listening
speaking

Target Populations
high school, adult
primary
grades k-3
pre-k to 3
secondary
early elementary
infancy-5 years
grades 1-9
elementary

Mode of Administration
administered [illegible] and completed on standardized forms
group administered; multiple-choice, paper and pencil format
group administered; multiple-choice format
individual, oral; child responds to a variety of stimuli
group administered; fill in the blank, paper and pencil format; tape recorded stimulus
group administered; paper and pencil, multiple-choice format
observer records presence or absence of skills on basis of extended observation
group administered; multiple-choice, paper and pencil format
administered to pairs of students, one presents task, other responds; responses tape recorded

TABLE 1 (cont.)

Instrument Number: 10
Title: DYCOMM: Dyadic Communication
Source: B. H. Byers, DYCOMM: Dyadic Communication. Honolulu: University of Hawaii.
Skills Tested: speaking; listening; interaction
Target Populations: adaptable k-12
Mode of Administration: groups of 10 or more students work in pairs, rotating among partners and tasks

Instrument Number: 11
Title: Fullerton Language Test for Adolescents
Source: Consulting Psychologists Press, Palo Alto, CA
Skills Tested: listening; auditory processing
Target Populations: 11-18 years; learning disabled and nondisabled
Mode of Administration: individual administration

Instrument Number: 12
Title: Fundamental Achievement
Source: The Psychological Corporation, 757 Third Avenue, New York, NY 10017
Skills Tested: listening; receptive language
Target Populations: grades 6-12
Mode of Administration: group administered; multiple-choice, paper and pencil format; taped instruction

Instrument Number: 13
Title: Gary, Indiana Oral Proficiency Examination
Source: Gary Community School Corporation, Gary, IN 46401
Skills Tested: speaking
Target Populations: grade 10
Mode of Administration: individual speech performance addressed to examiner

Instrument Number: 14
Title: Glynn County Speech Proficiency Examination
Source: CBE Demonstration Project, Glynn County Board of Education, Brunswick, GA 31521
Skills Tested: speaking
Target Populations: secondary
Mode of Administration: simulated public hearing; students presenting arguments one at a time; responses videotaped

Instrument Number: 15
Title: Language Assessment Scales
Source: Linguametrics Group, P.O. Box 454, Corte Madera, CA 94925
Skills Tested: speaking; listening
Target Populations: grades 1-5; Spanish or English
Mode of Administration: multiple-choice responses to oral presentations; oral imitation of sounds and words

Instrument Number: 16
Title: Language Dominance Survey
Source: Multilingual Center, Berkeley, California
Skills Tested: speaking; listening
Target Populations: grades k-12; Spanish, English
Mode of Administration: individual administration

Instrument Number: 17
Title: Language Facility Test
Source: The Allington Corporation, 801 N. Pitt St., Alexandria, VA 22314
Skills Tested: speaking
Target Populations: ages 3-15 for normal populations
Mode of Administration: individually administered; free responses to picture stimuli



TABLE 1 (cont.)

Instrument Number: 18
Title: Language Skills Communication Task
Source: M. C. Wang, S. Rose, & J. Maxwell, The Development of Communication Skills Task. Pittsburgh: University of Pittsburgh Learning Research and Development Center, 1973
Skills Tested: speaking; listening; interaction
Mode of Administration: students work in pairs; responses are recorded for subsequent scoring

Instrument Number: 19
Title: Listening Comprehension Tests
Source: A. Wilkinson, L. Stratta, and P. Dudley, [title illegible]. Macmillan Education, Houndmills, Basingstoke, Hampshire, England RG21
Skills Tested: listening
Target Populations: ages 10-11, 13-14, and 17-18
Mode of Administration: group administered; paper and pencil, multiple-choice format

Instrument Number: 20
Title: MACOSA Listening and Speaking Tests
Source: E. Plattor, W. R. Unruh, L. Muir, & K. D. Loose, Test Development for Assessing Achievement in Listening and Speaking. The Minister's Advisory Committee on Student Achievement, Planning and Research, Alberta Education, 109 Street, Edmonton, Alberta, Canada T5J 2V2
Skills Tested: speaking; listening
Target Populations: grades 3, 6, and 12
Mode of Administration: oral speaking test administered to small groups, each student responding in turn; responses tape recorded; written speaking test and listening test group administered; paper and pencil, multiple-choice format
TABLE 1 (cont.)

Instrument Number: 21
Title: Massachusetts Assessment of Basic Skills Listening Test
Source: Massachusetts Department of Education, Bureau of Research and Assessment, Boston, MA 02116
Skills Tested: listening
Target Populations: grades 7-12
Mode of Administration: group administered with tape recorded instructions, listening passages, and multiple-choice response

Instrument Number: 22
Title: Massachusetts Assessment of Basic Skills Speaking Test
Source: Massachusetts Department of Education, Bureau of Research and Assessment, Boston, MA 02116
Skills Tested: speaking
Target Populations: grades 7-12
Mode of Administration: two-tiered system with classroom teachers rating typical speaking abilities, and individual interviews for students who fail to pass the initial screening

Instrument Number: 23
Title: Measure of Communication Competence
Source: S. C. Riccillo, Children's Speech and Communicative Competence. Unpublished doctoral dissertation, University of Denver, 1974. University Microfilms No. 75-2210
Skills Tested: speaking
Target Populations: ages [illegible] to 4 years
Mode of Administration: individually administered, responses tape recorded

Instrument Number: 24
Title: Metropolitan Achievement Tests: Listening Comprehension
Source: The Psychological Corporation, 757 Third Avenue, New York, NY 10017
Skills Tested: listening
Target Populations: grades k-4
Mode of Administration: group administered, multiple-choice, paper and pencil format

Instrument Number: 25
Title: Michigan Educational Assessment Program: Listening Test
Source: Michigan Educational Assessment Program, Michigan Department of Education, P.O. Box 30008, Lansing, MI 48909
Skills Tested: listening
Target Populations: grades 4, 7, and 10
Mode of Administration: group administered; paper and pencil, multiple-choice format



TAHI:K I (cont:) 



liistniniciu 
Number 


Tide 


Si)urec 


Skills 
Tesied 


largel 
(*opulatiotis 


Mode ot 
Adniinisiraiion 


:() 


NaliDiKil Asscss- 
lucni ot IMuca- 
lioiuil i*n>i:icss 
I'llol Tcsi or 
Spcakiiii: ami 


See. N.A. MeaiL 
rhi' l)i'\rh>f>- 
nwnl of tin hi- 
W n iU v n i f o r 

,-\ WCVW//!,' 


speaking; lisicn- 
ing; allituiles 


age 17 


gri)U[i adniinisiered; 
nuiliiple-ehoicc'.; .. 
paper and pencil 
tiirmai. tape re- 
corded iiisirucilons 




lasicniiiy 


luiHlioiuiI " 
Ci>tntniii:u iition 

C \in\fH'h'IH t' of 

Seventeen 
Veiir/Olils. \'n- 
puhlisheil dis- 
scriaiion. 
L'nivorsiiy of 
iX'nver. 1977 








Instrument Number: 27
Title: New York State Regents Comprehensive Examination in English: Listening Section
Source: Division of Educational Testing, New York State Education Department, Albany, NY 12234
Skills Tested: listening
Target Populations: grade 12
Mode of Administration: group administered; examiner reads passages aloud; multiple-choice format


Instrument Number: 28
Title: New York Statewide Achievement Examination in English
Source: Division of Educational Testing, New York State Education Department, Albany, NY 12234
Skills Tested: speaking; listening
Target Populations: grade 12
Mode of Administration: for speaking section, students present brief monologues on supplied topics in class; listening section is group administered; passages are read aloud; multiple-choice format


Instrument Number: 29
Title: Oliphant Tests: Auditory Synthesizing Test and Auditory Discrimination Memory Test
Source: Educators Publishing Service, Cambridge, MA 02138
Skills Tested: auditory memory
Target Populations: ages 7-14
Mode of Administration: sounds are presented that examinee must hold in memory or discriminate


Instrument Number: 30
Title: Oral Language Evaluation
Source: EMC Corporation, St. Paul, MN
Skills Tested: speaking; listening; interaction
Target Populations: elementary; Spanish, English
Mode of Administration: individually administered; student's discussion of supplied stimuli is tape recorded and transcribed






TABLE 1 (cont.)

Instrument Number / Title / Source / Skills Tested / Target Populations / Mode of Administration
























Instrument Number: 31
Title: Profile of Nonverbal Sensitivity
Source: R. Rosenthal, J. A. Hall, M. R. DiMatteo, P. L. Rogers, and D. Archer. Sensitivity to Nonverbal Communication. Baltimore: Johns Hopkins University Press
Skills Tested: nonverbal decoding
Target Populations: grades 3-6; high school
Mode of Administration: group administered; students view videotape or film; multiple-choice response format




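Instrument Number: 32
Title: PRI Reading Systems: Oral Language Skill Clusters
Source: McGraw-Hill, New York, NY [ZIP code illegible in source]
Skills Tested: listening; nonverbal decoding
Target Populations: grades K-3
Mode of Administration: group administered; multiple-choice format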


Instrument Number: 33
Title: SRA Achievement Series
Source: Science Research Associates, Inc., 155 North Wacker Dr., Chicago, IL 60606
Skills Tested: listening; auditory discrimination
Target Populations: grades K-3
Mode of Administration: group administered; paper and pencil, multiple-choice format


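Instrument Number: 34
Title: Sequential Tests of Educational Progress: Listening
Source: Addison-Wesley, Reading, MA 01867
Skills Tested: listening
Target Populations: grades [range illegible in source]
Mode of Administration: group administered; multiple-choice format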


Instrument Number: 35
Title: Stanford Achievement Tests: Listening Comprehension
Source: Harcourt Brace Jovanovich, New York, NY 10017
Skills Tested: listening
Target Populations: grades 1-6
Mode of Administration: group administered; multiple-choice, paper and pencil format


Instrument Number: 36
Title: Stanford Early School Achievement Test
Source: Harcourt Brace Jovanovich, New York, NY 10017
Skills Tested: listening
Target Populations: grades K-1
Mode of Administration: group administered; multiple-choice, paper and pencil format



TABLE 1 (cont.)

Instrument Number / Title / Source / Skills Tested / Target Populations / Mode of Administration














Instrument Number: 37
Title: Situational Language Tasks
Source: [two author names illegible in source], K. Meredith, and J. M. Fillerup. Use of Situational Tasks in the Intra-TEEM and TEEM versus Comparison Evaluation. Tucson, AZ: University of Arizona College of Education [year illegible in source]
Skills Tested: speaking; listening; interaction
Target Populations: grades 1-[upper bound illegible in source]
Mode of Administration: includes whole-class discussion, and structured and unstructured small group discussion; talk is recorded and transcribed








Instrument Number: 38
Title: Speech in the Classroom: Assessment Instruments of Speaking Skills
Source: Bureau of Curriculum Services, Pennsylvania Department of Education, 333 Market Street, Harrisburg, PA 17126
Skills Tested: speaking; speaking experience; attitudes
Target Populations: grades 1-12
Mode of Administration: assessment of speaking skills individually administered; others group administered; paper and pencil, multiple-choice format








Instrument Number: 39
Title: Test of Adolescent Language
Source: [publisher name illegible in source], Perry Brooks Building, Austin, TX 78701
Skills Tested: speaking; listening
Target Populations: ages 11-18
Mode of Administration: speaking tests individually administered; listening tests group administered; paper and pencil, multiple-choice format


Instrument Number: 40
Title: Test of Listening Accuracy in Children
Source: Communication Research Associates, Inc., Box 11012, Salt Lake City, UT 84111
Skills Tested: listening; nonverbal decoding
Target Populations: grades K-6
Mode of Administration: group administered; multiple-choice format








Instrument Number: 41
Title: Torrance Tests of Creative Thinking: Verbal Test
Source: Scholastic Testing Service, Inc., 480 Meyer Rd., Bensenville, IL 60106
Skills Tested: creative thinking; speaking
Target Populations: grades K-3
Mode of Administration: individually administered








Instrument Number: 42
Title: Utah Test of Language Development (Direct Test Version)
Source: Communication Research Associates, Inc., Box 11012, Salt Lake City, UT 84111
Skills Tested: speaking; listening; general language ability
Target Populations: ages 2-14
Mode of Administration: individually administered









TABLE 1 (cont.)

Instrument Number / Title / Source / Skills Tested / Target Populations / Mode of Administration
















Instrument Number: 43
Title: Vermont Basic Competency Speaking and Listening Assessment
Source: Vermont Department of Education, Montpelier, VT 05602
Skills Tested: speaking; listening; interaction
Target Populations: grades K-12
Mode of Administration: variety of simulation tasks and observations conducted in classrooms



Instrument Number: 44
Title: Wallner Test of Listening Comprehension
Source: N. K. Wallner, "The Development of a Listening Comprehension Test for Kindergarten and Beginning First Grade." Educational and Psychological Measurement, 1974, 34 [page numbers illegible in source]
Skills Tested: listening
Target Populations: grades K-1
Mode of Administration: recorded instructions, listening passages, and responses; multiple-choice format; administered to small groups

Instrument Number: 45
Title: School Minimum Competency Test
Source: Westside Community Schools, Omaha, NE
Skills Tested: speaking
Target Populations: grade 10
Mode of Administration: students present individual talks to group



Only four of the items are evaluative, and only one makes reference to the overall
adequacy of the measure as a tool for assessing communication competence. This approach
was adopted for several reasons. First, it is not possible to recommend or condemn instruments
without knowledge of the specific purposes for which they are being used. An instrument that
is useful for evaluating program or teacher effectiveness may not be adequate for placing
students in individualized instruction (Rubin, 1981). Second, in the absence of a consensually
acceptable model of competent communication, it is difficult to evaluate instruments' content
and construct validity. Objectives and competency lists adopted in one jurisdiction may differ
widely from those that guide test selection in another district. For example, some districts
emphasize formal, mechanical aspects of vocal delivery (e.g., Gary Community School
Corporation, 1977-1978), while others focus on functional aspects of executing communication
tasks (e.g., Vermont Department of Education, 1977). At least for the present, selection of
evaluation criteria and instruments should be conducted at local levels in accordance with
enlightened community standards. Finally, it is anticipated that the primary users of these
reviews will not be speech communication scholars, but rather evaluation specialists and school
administrators. The instrument review form reflects the general concerns of this target audience
with respect to psychometric adequacy and administrative feasibility.

The section of the instrument review form on validity is concerned with the extent to which
the instrument actually measures the skills or knowledge it intends to measure. Validity may
be determined in many ways, and the presence of multiple validity studies using different methods
and different target populations strengthens the case that the instrument actually measures what
it purports to measure.






Predictive validity deals with the ability of the instrument to predict performance on another
measure that is known to be valid and that is theoretically related to the instrument in question.
For example, a test of communication competence might be assumed to predict success in jobs
that rely heavily on oral communication.

Concurrent validity is similar to predictive validity except that it focuses on the relationship
between individuals' performance on the instrument in question and on other instruments that
measure the same thing. If a group of students were administered a speaking performance test
and were also rated by their speech teacher, then the correlation on these two measures would
be a test of concurrent validity.
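
The correlation described above can be illustrated with a short calculation. The sketch below assumes the Pearson product-moment coefficient, which the text does not name, and uses invented scores for six students; the function name and data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    sum_sq_x = sum(d * d for d in dev_x)
    sum_sq_y = sum(d * d for d in dev_y)
    if sum_sq_x == 0 or sum_sq_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    return cross / sqrt(sum_sq_x * sum_sq_y)


# Hypothetical data: speaking test scores and teacher ratings for six students.
test_scores = [12, 15, 9, 20, 17, 11]
teacher_ratings = [3, 4, 2, 5, 4, 3]

r = pearson_r(test_scores, teacher_ratings)
print(f"concurrent validity coefficient: {r:.2f}")
```

A coefficient near 1 would suggest that the test ranks students much as the teacher does; how high a value is "high enough" is a judgment the text leaves to the test user.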

Content validity indicates the degree to which the content of an instrument represents the
domain of knowledge and skills it intends to measure. Content validity is usually determined
through expert judgment. In one common method, experts are given a description of the
test objectives and then asked to categorize each item by these objectives. Content validity is
measured by the degree of agreement among judges in the category assignments.
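
As one way to picture "degree of agreement among judges," the sketch below computes the proportion of judge pairs that assign the same objective to each item. This simple pairwise index is an assumption for illustration; the text does not specify a statistic, and the judges, items, and labels are invented.

```python
from itertools import combinations


def pairwise_agreement(assignments):
    """Proportion of (judge pair, item) comparisons with matching categories.

    assignments maps each judge's name to a list of category labels,
    one label per test item, in the same item order for every judge.
    """
    judges = list(assignments)
    if len(judges) < 2:
        raise ValueError("need at least two judges")
    n_items = len(assignments[judges[0]])
    if n_items == 0 or any(len(assignments[j]) != n_items for j in judges):
        raise ValueError("every judge must categorize the same non-empty item list")
    matches = 0
    comparisons = 0
    for first, second in combinations(judges, 2):
        for label_a, label_b in zip(assignments[first], assignments[second]):
            comparisons += 1
            if label_a == label_b:
                matches += 1
    return matches / comparisons


# Hypothetical data: three judges sort four items into test objectives.
judge_assignments = {
    "judge_a": ["literal", "inference", "purpose", "literal"],
    "judge_b": ["literal", "inference", "purpose", "inference"],
    "judge_c": ["literal", "inference", "literal", "literal"],
}

agreement = pairwise_agreement(judge_assignments)
print(f"pairwise agreement: {agreement:.2f}")
```

Raw agreement of this kind does not correct for matches that occur by chance, so it is best read as a rough indicator.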

Building a case for construct validity takes many forms. Construct validity includes any
experiment that sheds light on the nature of the phenomenon that the instrument is trying to
measure. Factor analysis of the items in an instrument is sometimes conducted in order to
explore the underlying relationships among the items. Theoretical models about the phenomenon
are used to formulate and test hypotheses about how the instrument should operate. For example,
several listening comprehension tests could be administered to students along with reading and
intelligence tests. If listening is a unique skill, then the listening tests should be more highly
correlated with one another than with the tests of the other abilities. A third common method
for examining construct validity is the known groups method. Here the instrument is
administered to two populations that are known to possess and not possess the knowledge or skill
being measured. The degree to which the instrument separates the population into the appropriate
subgroups is a measure of construct validity.
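
The known groups method can be sketched as a classification check. The example below assumes that separation is judged by a single cutoff score and reports the proportion of examinees placed in the correct group; the cutoff approach, function names, and scores are all hypothetical.

```python
def known_groups_separation(skilled, unskilled, cutoff):
    """Proportion of examinees the cutoff places in their known group.

    Skilled examinees count as correct when they score at or above the
    cutoff; unskilled examinees count as correct when they score below it.
    """
    if not skilled or not unskilled:
        raise ValueError("both groups must contain at least one score")
    correct = sum(score >= cutoff for score in skilled)
    correct += sum(score < cutoff for score in unskilled)
    return correct / (len(skilled) + len(unskilled))


def best_cutoff(skilled, unskilled):
    """Observed score that separates the two known groups most accurately."""
    candidates = sorted(set(skilled) | set(unskilled))
    return max(candidates,
               key=lambda c: known_groups_separation(skilled, unskilled, c))


# Hypothetical data: scores for groups known to have or lack the skill.
skilled_scores = [18, 22, 25, 19, 21]
unskilled_scores = [10, 14, 12, 19, 9]

cutoff = best_cutoff(skilled_scores, unskilled_scores)
accuracy = known_groups_separation(skilled_scores, unskilled_scores, cutoff)
print(f"cutoff {cutoff} separates the groups with accuracy {accuracy:.2f}")
```

Because the cutoff is chosen from the same data it is evaluated on, the resulting accuracy is optimistic; a fresh sample would give a fairer estimate.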

The section of the review form on reliability reports the measurement accuracy of the
instrument. There are various methods for determining reliability. Test-retest reliability
measures the stability of an instrument over time. Assuming that the respondents have not been
exposed to instruction and have not undergone a major growth in the knowledge or skills being
measured, they should receive approximately the same score on an instrument at two points in
time. This is a measure of test-retest reliability.

In some cases instruments are designed to have alternate forms that are equivalent in content
and difficulty. The correlation between individuals' scores on the different forms is a test of
alternate forms reliability.

Taking the concept of alternate forms reliability a step further, it is possible to think of an
instrument as a random set of items, each of which is a "test" of some part of the content
domain. The degree to which the respondents' performance on one item is related to their
performance on other items is a measure of internal consistency reliability.
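
Internal consistency is often summarized with a single coefficient. The sketch below assumes Cronbach's alpha, a widely used index that the text itself does not name, and applies it to invented right/wrong item scores for five respondents.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents' item-score lists.

    item_scores[r][i] is respondent r's score on item i. Sample variances
    are used throughout, so at least two respondents are required.
    """
    n_respondents = len(item_scores)
    if n_respondents < 2:
        raise ValueError("need at least two respondents")
    k = len(item_scores[0])
    if k < 2 or any(len(row) != k for row in item_scores):
        raise ValueError("need at least two items, scored for every respondent")
    item_columns = list(zip(*item_scores))
    sum_item_variances = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores must vary")
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical data: five respondents, four items scored 1 (right) or 0 (wrong).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
print(f"internal consistency (alpha): {alpha:.2f}")
```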

Tests of performance are markedly different from paper and pencil tests. For these tests,
measurement takes place within the person who assigns the rating or score. Here the reliability
of the scorer is at issue, not the reliability of the test. Scoring reliability is usually assessed by
having more than one person rate the same performance. The correlations or percentages of
agreement in these ratings are a test of scoring reliability. Usually scorers are evaluated for
reliability after training but before they begin rating. However, to ensure that scorers remain
consistent over time, it is important to check their reliability during the scoring process as well.
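
The percentage-of-agreement check mentioned above can be sketched as follows. The example assumes two raters scoring the same eight performances on a five-point scale and reports both exact agreement and agreement within one scale point; the tolerance option and the ratings are illustrative assumptions.

```python
def percent_agreement(rater_a, rater_b, tolerance=0):
    """Percentage of performances on which two raters agree.

    With tolerance=0 only identical ratings count as agreement; with
    tolerance=1, ratings that differ by one scale point also count.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of performances")
    hits = sum(abs(a - b) <= tolerance for a, b in zip(rater_a, rater_b))
    return 100.0 * hits / len(rater_a)


# Hypothetical data: two raters score eight taped performances from 1 to 5.
first_rater = [4, 3, 5, 2, 4, 3, 1, 5]
second_rater = [4, 2, 5, 2, 3, 3, 1, 3]

exact = percent_agreement(first_rater, second_rater)
adjacent = percent_agreement(first_rater, second_rater, tolerance=1)
print(f"exact agreement: {exact:.1f}%, within one point: {adjacent:.1f}%")
```

Running the same check on a sample of performances partway through scoring is one way to carry out the mid-process reliability check the text recommends.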

As a part of the development of some large scale assessment instruments, norming or criterion
setting studies are conducted, and these are also discussed on the review form. For these studies,
the instrument is administered to a large number of respondents and the results provide
performance benchmarks for future users of the instrument. Norming studies for standardized
achievement tests yield charts that transform raw scores into normed scores, most frequently
grade equivalence. Standard setting studies are sometimes conducted for tests that measure
mastery of specific objectives, that is, criterion referenced tests. Data collected from samples of
students are usually compared with data from another source, such as teacher ratings, to determine
what test scores represent mastery level. A caution for all norming and criterion setting data
is that the characteristics of the original population assessed may be different from the population
that the user is assessing.



TYPES OF RESPONSE AND SCORING PROCEDURES 

Multiple-choice formats are the stock-in-trade of standardized testing. Questions are designed
so that each has a single correct answer; tests can be graded easily by machine or template
without any problems of unreliability in scoring. Item difficulty is readily ascertained and
controlled, and test forms can be equated by well-established methods. Two indirect tests of
speaking ability attempted to utilize multiple-choice responses (20, 26), but the technique is
widely represented among tests of listening proficiency (1, 2, 3, 4, 8, 19, 20, 21, 24, 27, 28,
32, 33, 34, 35, 40, 44). (Note: The numbers correspond to the instrument numbers used in
Table 1 and instrument reviews in Chapter 2.) Not only are multiple-choice questions used to
measure literal comprehension, but also to assess higher order abilities like recognition of
speaker's purpose, inference-making, and aspects of critical listening. One of the drawbacks
of many multiple-choice listening tests is that students must read printed questions and response
alternatives, thus confounding listening ability with reading ability. Some listening tests counter
this problem by using tape recorded presentations of questions and response options (21, 25,
26). Others use pictures instead of verbal response options (4, 33, 40).
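
To show why item difficulty is "readily ascertained" for multiple-choice formats, the sketch below scores responses against an answer key and reports the proportion of examinees answering each item correctly. The proportion-correct index is a conventional choice assumed here, and the key and responses are invented.

```python
def item_difficulty(responses, answer_key):
    """Proportion of examinees answering each item correctly.

    responses is a list of examinees' answer lists; answer_key holds the
    single correct option for each item, in the same item order.
    """
    if not responses:
        raise ValueError("need at least one examinee")
    if any(len(answers) != len(answer_key) for answers in responses):
        raise ValueError("every examinee must answer every item")
    n_examinees = len(responses)
    return [
        sum(answers[item] == correct for answers in responses) / n_examinees
        for item, correct in enumerate(answer_key)
    ]


# Hypothetical data: four examinees answer a three-item listening test.
answer_key = ["B", "D", "A"]
examinee_answers = [
    ["B", "D", "A"],
    ["B", "C", "C"],
    ["B", "D", "C"],
    ["B", "A", "B"],
]

difficulties = item_difficulty(examinee_answers, answer_key)
for number, proportion in enumerate(difficulties, start=1):
    print(f"item {number}: {proportion:.2f} answered correctly")
```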

Another technique employed in some measures of listening skill is behavioral response. In
particular, this type of performance measure is used in direction following tasks (1, 11, 16,
34, 42, 43). In general, those tasks approximate normal listening activity, and thus they are
more valid than less direct measures. However, in some cases the types of behavioral responses
demanded may be quite artificial (e.g., "Place a circle around the second largest square").

Speaking 

The most common means for assessing speaking skill are performance rating scales (4, 13, 14,
15, 16, 17, 20, 22, 28, 30, 38, 43, 45). Rubin (1981) discusses a number of factors pertaining
to the use of this technique in large scale assessments. Their major disadvantages lie in the
potential for unreliable scoring and in the relatively large expenditures of staff time. Some
systems seek to avoid the costs of external raters by having classroom teachers evaluate students'
typical (22) or elicited (28, 38, 43) speech. This approach would seem to exacerbate the problem
of rating error, and beg the question of time allocations.

Alternatives to using performance rating scales in assessing speaking ability are techniques
that take particular discourse features as indicators of quality of expression. For example, Loban
(1976) and McCaleb (1978) both suggest the use of measures of syntactic complexity for
assessing oral proficiency. Some measures of discourse features require speech samples to be
transcribed and scored later (4-optional, 30-optional, 37). Others call for on the spot judgment
of the presence or absence of specified features, and thus they do not require transcriptions (7,
16, 42). In general, the types of features measured are essentially linguistic, such as total number
of words, lexical diversity, articulation, and sentence expansion (4, 7, 16, 23, 37, 42). Other
evaluation schemes of this type employ a combination of linguistic and whole-text descriptions
(17, 30). Whole-text descriptions include such rubrics as, "Narrative that goes beyond the
information given in the pictorial stimulus." Extreme caution is in order, however, in
interpreting specific discourse features as indicators of quality. Concurrent validity, wherein such
features are shown to predict judgments of overall quality, has rarely been established. Indeed,
evidence has accumulated that directly contradicts the use of syntactic complexity in particular
as a measure of quality of expression (Crowhurst, 1979).
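
Two of the linguistic features named above, total number of words and lexical diversity, can be counted mechanically from a transcript. The sketch below assumes the type-token ratio as the diversity index, which the text does not specify, and uses an invented utterance; as the text cautions, such counts are not established indicators of quality.

```python
import re


def discourse_features(transcript):
    """Count total words, distinct words, and their ratio in a transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    total = len(words)
    distinct = len(set(words))
    ratio = distinct / total if total else 0.0
    return {
        "total_words": total,
        "distinct_words": distinct,
        "type_token_ratio": ratio,
    }


# Hypothetical transcribed speech sample from a picture-description task.
sample = "The dog ran and the boy ran after the dog"

features = discourse_features(sample)
print(features)
```

The type-token ratio tends to fall as samples get longer, so comparisons are only meaningful between transcripts of similar length.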

When speaking tasks are structured in a way that permits objective measurement of success,
it is possible to derive measures of communication effectiveness. For example, it is possible
to use "shift of opinion ballots," which ask audience members to indicate their attitudes toward
a topic both before and after the delivery of a persuasive speech, to measure the effectiveness
of the speaker. Referential communication tasks (Dickson and Patterson, 1979) measure
communication effectiveness of a speaker by seeing whether a listener can identify the correct
object from an array based on the speaker's description of the object. Effectiveness of small
group communication can be evaluated by assigning a unique-solution problem to a group and
then recording the accuracy and speed of their solution. These techniques, however, do not
elicit uncontaminated measures of individual communication competence because audience
characteristics, listener skill, and group composition are factors beyond the control of the speaker
and can affect communication success. The effectiveness of some referential communication
tasks, however, can be assessed without recourse to measuring listener accuracy. For example,
some tasks require the speaker to state the attributes of an object or geometric figure that
uniquely describe it. Communication effectiveness is evaluated simply by counting the number
of critical features that the speaker identifies (Piche, Rubin, and Turner, 1980).
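
A shift of opinion ballot can be tallied in a few lines. The sketch below assumes each audience member marks a five-point attitude scale before and after the speech, with higher numbers favoring the speaker's position; the scale, function name, and ballots are hypothetical.

```python
def opinion_shift_summary(before, after):
    """Summarize attitude change across paired before/after ballots."""
    if len(before) != len(after) or not before:
        raise ValueError("need matching, non-empty before and after ballots")
    shifts = [post - pre for pre, post in zip(before, after)]
    return {
        "mean_shift": sum(shifts) / len(shifts),
        "moved_toward_speaker": sum(shift > 0 for shift in shifts),
        "moved_away": sum(shift < 0 for shift in shifts),
        "unchanged": sum(shift == 0 for shift in shifts),
    }


# Hypothetical ballots from five audience members (1 = oppose, 5 = favor).
ballots_before = [2, 3, 3, 4, 1]
ballots_after = [4, 3, 4, 5, 2]

summary = opinion_shift_summary(ballots_before, ballots_after)
print(summary)
```

As the text notes, a result like this reflects the audience as well as the speaker, so it should not be read as a pure measure of individual competence.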

Conspicuous in its absence from the instruments reviewed is use of interaction coding systems
for assessing communication skill. Such devices are based on observations of naturalistic
interactions. They include simple sociograms indicating the frequency and direction of
communication flow, as well as category systems that may classify communicators' messages as
constructive or dysfunctional vis-a-vis group maintenance and task functions (Bales, 1953).
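
The frequency-and-direction tally behind a simple sociogram can be sketched as below. The example assumes an observer has logged each remark as a (speaker, addressee) pair; the log format and the participant names are invented for illustration.

```python
from collections import Counter


def sociogram_counts(events):
    """Tally directed remarks, remarks sent, and remarks received.

    events is a list of (speaker, addressee) pairs in the order observed.
    """
    directed = Counter(events)
    sent = Counter(speaker for speaker, _ in events)
    received = Counter(addressee for _, addressee in events)
    return directed, sent, received


# Hypothetical observation log from a small group discussion.
observed = [
    ("Ana", "Ben"),
    ("Ben", "Ana"),
    ("Ana", "Cal"),
    ("Ana", "Ben"),
    ("Cal", "Ana"),
]

directed, sent, received = sociogram_counts(observed)
for (speaker, addressee), count in sorted(directed.items()):
    print(f"{speaker} -> {addressee}: {count}")
```

A category system of the kind attributed to Bales would add a code for each remark's function; that extension is not shown here.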

CONTENT OF ASSESSMENT INSTRUMENTS
Listening

Listening is not a unitary skill, but it is rather a complex of subskills, each of which is brought
into play to greater or lesser degrees depending on the nature of the listening task (Lundsteen,
1979). It is natural, therefore, that tests of listening ability tap a variety of skills. Test users
should make sure that the listening test selected conforms to their particular measurement
objectives.

Most often, listening tests measure literal comprehension of spoken material (1, 3, 4, 6, 8,
19, 20, 21, 24, 25, 26, 27, 28, 32, 33, 34, 35). It should be noted that comprehension is
generally confused with recall or retention, since questions typically follow some extended
discourse. Two testing methods alleviate this confusion. The Cloze Test (5) provides verbal
context that may lessen reliance on memory. Similarly, tests that deliberately select brief
passages and present few questions for each passage (21) may tax memory to a lesser extent.

Many listening tests focus on listening for directions (1, 11, 16, 42, 43), a type of purposeful
listening that is readily measured by accuracy of behavioral response (e.g., circling the correct
item, drawing the proper path on a map). Other listening skills frequently measured include
recognition of speaker's purpose (21, 25, 27), making inferences or interpretations beyond
material given (3, 4, 19, 20, 21, 25, 27, 28, 32), and summarizing (25, 43).

A few listening instruments emerge as covering rather unique content. The Listening
Comprehension Tests (19) include subtests reflecting ability to interpret paralinguistic cues and also
ability to render social judgments from speech. Several tests not reviewed here also measure
sensitivity to paralinguistic cues (Smith-Elliot Listening Test; Davitz and Mattis, 1964). An
orientation to functions of listening (for example, to gain information or to evaluate credibility),
as opposed to subskills, is displayed by the NAEP Pilot Test (26) as well as the Massachusetts
Assessment of Basic Skills (21). The NAEP Pilot Test (26) and the Smith-Elliot Test (Learning
Dynamics, Inc.) make some provision to assess comprehension of facial and gestural
communication.

Some of the subskills tested by instruments reviewed here are abstracted from any reasonably
constructed communication context. These subskills, while critical to communicative listening,
are so narrow that they might better be considered receptive language skills. Such receptive
language skills include, most prominently, vocabulary (1, 2, 3, 4, 6, 8, 15, 31, 38, 40), syntax
(31), and phoneme recognition and discrimination (3, 13, 15, 29, 32, 33, 36, 39, 46). Phonemic
discrimination and identification is viewed as essential for reading readiness, but should not
be construed as a measure of listening ability.

Outside of the classroom, the bulk of listening activity takes place in the course of interaction.
When referential communication tasks assessing skills of both the speakers and the listeners
permit free oral interchange (9, 10, 18), these tests approximate interactive communication.
Other instruments measure interactive listening skill more indirectly by including conversational
speech among their listening passages (21, 25, 26). In general, however, interactive listening
is an area calling for vigorous test development efforts.

Speaking

The content of speaking assessment procedures is as varied as that of listening tests. One way
in which this content can be categorized is in terms of mode of discourse. At the elementary
age level, most tasks are either narrative (4, 43) or descriptive (9, 10, 18, 22). A number of
tests designed for non-native speakers also rely on story-telling (15, 17, 30). For older native
English speakers, greater variety is evident. The tasks often call for exposition in the form of
extended monologues (13, 20, 38, 43, 45). Other modes of discourse include extended
persuasive monologues or simulated persuasive conversations (13, 22, 43), telephone conversations
(22, 43), introductions (43), and responding to questions in an interview (13, 23, 43).

Speech assessment procedures can be categorized in terms of communication situations as
well. In particular, it is useful to examine the types of audiences that are featured in oral
performance tasks. Of course, students will be aware of the examiner as an ultimate audience.
However, in the majority of instruments reviewed, the examiner is the sole audience to whom
students speak. Speakers do not typically communicate in order that their oral proficiency may
be evaluated. Indeed, evaluation usually inhibits communication. To the extent that assessment
procedures offer no pretense for speaking other than evaluation, these procedures yield
inaccurate samples of communication performance.

A single examiner-audience is most natural in interview situations (13, 23, 43). One pitfall
of interview situations is that the interviewer may exert overriding influence on students' speech
behavior, resulting in considerable unreliability (Mullen, 1978; Hitchman, 1966; Bazen, 1978).
A single examiner-audience is most anomalous in those situations in which students are called
upon to deliver a speech to that individual (13). The problem of unnatural audiences is somewhat
relieved by procedures that simulate situations involving realistic speaker/audience relations.
These procedures may ask students to simulate an emergency telephone call to a police operator,
giving directions to a stranger, or persuading a friend to grant some favor (22, 43). These
simulation tasks, however, confuse speech proficiency with role-playing ability.

Group discussion has been accorded great importance as an instructional technique, and
British educators have attempted to utilize small group peer interaction as an assessment situation
(Barnes, 1980). Of the instruments reviewed here, only two sample naturalistic interaction in
peer groups (37, [second instrument number illegible in source]). Dyadic referential
communication tasks (10, 12, 18) also approximate natural speaker/audience relations.

Criteria

A final aspect of oral assessment content that requires examination is evaluation criteria. Along
what dimensions of quality do the instruments render judgments? Listening tests are primarily
concerned with accuracy: accuracy of recall, of following directions, of perceptions about
social relationships. Conceivably, tests could be devised that provide information about listening
activity, as well. Such instruments would indicate what type of listening (critical, aesthetic,
informative) students are engaging in during the course of a stimulus passage, and the degree
of concentration or constructive assimilation that characterizes their listening processes.

Speech assessment procedures exhibit a fair amount of consistency in their evaluation criteria.
Becker (1962) found that typical speech rating scales reflect only three clusters of judgment
despite the fact that they may include a larger number of variously labelled criteria. These
clusters are content, delivery, and language. These criteria, with the addition of organization,
account for most performance rating scales reviewed in this report (13, 20, 22, 28, 45). Despite
this consistency in the nature of criteria, rating schemes differ in the weight accorded each
criterion and differ in the manner in which the criteria are defined. In particular, instruments
vary in their treatment of language. Some instruments weight language most heavily of the
criteria, while others apportion emphasis more equally among dimensions of quality. One rating
scale, for example, devotes three of seven items to aspects of language, while the remainder
concern delivery factors (13). Procedures that result in single, general impression scores rather
than analytic judgments (28, 38) by design provide no guidance in how criteria are to be
weighted. The definition of language quality adopted by some instruments stresses conformity
to the conventions of standard American English (13, 45). Other instruments, particularly those
designed for non-native speakers, convey more detailed information about the types of
grammatical structures mastered (15, 16, 37, 39).
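
The effect of weighting criteria differently can be made concrete with a small calculation. The sketch below assumes an analytic scale with the four criteria named above and computes a weighted average under two invented weighting schemes; the ratings and weights are hypothetical and are not drawn from any reviewed instrument.

```python
def weighted_rating(ratings, weights):
    """Weighted average of analytic ratings, one rating per criterion."""
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same criteria")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(ratings[name] * weights[name] for name in ratings)
    return weighted_sum / total_weight


# Hypothetical ratings for one speaker on a five-point analytic scale.
speaker_ratings = {"content": 4, "organization": 3, "delivery": 5, "language": 2}

# Two hypothetical schemes: equal emphasis versus language weighted most heavily.
equal_weights = {"content": 1, "organization": 1, "delivery": 1, "language": 1}
language_heavy = {"content": 1, "organization": 1, "delivery": 1, "language": 3}

equal_score = weighted_rating(speaker_ratings, equal_weights)
language_score = weighted_rating(speaker_ratings, language_heavy)
print(f"equal weights: {equal_score:.2f}, language-heavy: {language_score:.2f}")
```

The same performance earns a lower overall score under the language-heavy scheme, which illustrates why test users should examine how an instrument weights its criteria.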

Just as some listening tests were characterized as so narrow as to qualify more as tests of
receptive language, so are some speaking tests measures of productive language, and not
communication. This is certainly true of procedures that ask students to imitate words or
sentences in isolation and then apply criteria that evaluate articulation or grammatical
interference of a first language (29, 39, 42). It is no less true of procedures that incorporate some
communicative context like an interview, and then rate speakers on exclusively linguistic
grounds (Mullen, 1978). Merely eliciting language by means of a communicative task does
not constitute a test of communication competence (Carroll, 1980). To repeat an earlier caution
concerning discourse features, it is risky to assume untested relations between linguistic
properties and overall quality of expression.

A few speaking instruments that emphasize language quality criteria reflect the contextual
and interactive aspects of communication better than many of the more conventional rating
scales. These instruments measure the degree to which language is appropriate or adapted to
the demands of the communication task. For example, ratings of a response may depend on
the type of question asked (23). Or a test may measure the degree of elaboration, not just
simple labelling, that is expected in a response to a narrative task (4, 17, 30). Rating scale
items may express communication oriented criteria like "appropriateness" or "intelligibility,"
rather than formal linguistic properties like sentence structure, standard usage, or correct
pronunciation.

ADMINISTRATIVE FEASIBILITY 

If measures of speaking and listening proficiency are to be adopted for large scale assessment
programs, they must be administratively feasible. They must not consume excessive amounts
of pupil time, must not require unreasonable allocation of personnel for administration and
scoring, and must not require highly specialized training for administration, scoring, and
interpretation. Unlike other basic skills, however, communication is a complex, interactive behavior.
Therefore, tests of communication competence are apt to be more expensive than many other
large scale assessment procedures.

Many tests of listening ability are, however, amenable to group administration (1, 2, 3, 5,
6, 8, 19, 21, 24, 25, 26, 27, 32, 35). Even skill at following directions can be assessed in
this manner (1, 34, 43). Tape recorded administration instructions and response options (21,
25, 26) not only reduce unreliability and confounding with reading ability, but they also
contribute to ease of testing. Some listening measures, on the other hand, allow for a wider
range of response modes (11, 15, 16, 43), and these require individual administration.

Assessments of speaking skill conducted as interviews or as extended monologues naturally
demand individual administration. Moreover, it is advisable to use multiple raters to ensure
reliability. This can be accomplished by assigning two staff members for "live" rating, or by
tape recording performances for subsequent evaluation by two raters. One instrument attempts
to reduce the testing burden by requiring classroom teachers to screen their students based on
their typical classroom communication behavior and to refer only those students "in question"
for individual assessment (22). However, there is some evidence that suggests that these screening
ratings were subject to bias, and they are not reliable.

It is possible to reduce administration costs by using group communication tasks, since a
number of students can be evaluated during the course of, say, a twenty minute discussion
session (Follurd and Robertson, 1976). Similarly, referential communication tasks (9, 10, 18)
may also be adapted to simultaneous administration to several dyads. Workers at the University
of Wisconsin-Madison Research and Development Center for Individualized Schooling are
presently experimenting with a promising application of mini-computers which present stimulus
arrays for referential tasks and record accuracy of decoding. The least practical methods of
oral examination are those that require subsequent transcription and analysis of speech samples
(37). Also, some procedures require raters who are well trained in identifying linguistic
structures (7, 16, 42).

TARGET POPULATIONS AND POTENTIAL SOURCES 
OF TEST BIAS 

The instruments reviewed here cover the entire K-12 age range, although the elementary grades
receive particular emphasis, especially among commercially developed instruments. Several of
the measures include alternate forms that can be administered in English or in Spanish (15,
16, 17, 30). Indeed, it appears that sophisticated advances in communication assessment have
emerged from the field of second language testing (Carroll, 1980). Only a single instrument
is specifically designated as appropriate for special education populations ( I j).

Stiggins (1981) discusses a number of sources of bias in communication testing. Instruments
vary considerably in their efforts to minimize group bias effects. Some technical manuals
document the work of minority group reviewers who examined items in order to eliminate
potential bias (26). Other manuals tabulate normative data separately for black and white students
(4). It should be noted, however, that differences in central tendency are not, themselves,
evidence of test bias. Rather, a test is biased if it over- or under-predicts scores on some
independently administered criterion measure (Cleary, 1968). In the absence of criterion
measures of communication quality, it is difficult to ascertain test bias. The majority of
instruments reviewed here, however, do not address the issue of potential group biases. Indeed,
some scoring rubrics assign particular weight to standard English dialect patterns, a procedure
that likely places nonstandard dialect speakers at a disadvantage.

LOCALLY DEVELOPED INSTRUMENTS

Developing instruments locally for assessing listening and speaking requires considerable time
and effort as well as familiarity with measurement and content concerns. Often it is not feasible
to submit locally developed instruments to the same degree of technical review for reliability
or validity as commercially developed instruments. However, there are some situations where
local development of assessment instruments is desirable. For example, a school district may
adopt a set of specific speaking and listening competencies and develop an instructional program
directed toward building those competencies. In order to measure its success, the district may
find that it is better to develop a test locally that is tailored to its specific competencies than
to use existing tests that only measure some of those competencies or that only measure those
competencies indirectly. The following brief step-by-step descriptions of the development
process provide direction to local agencies that wish to develop their own speaking and listening
instruments.

Listening

To determine the listening skills and types of listening tasks that are important, local developers
should begin by defining the types of listening skills and tasks that students should be able to
perform. In developing this list, developers will find it helpful to review curricular objectives,
instructional materials, and teaching practices. They should involve a full range of people
concerned with the results of the assessment, for example, teachers, curriculum specialists,
administrators, parents, and students. The resulting list may focus on skills that are important
to all listening situations, for example, understanding main ideas and details. These skills may
be similar to reading comprehension skills. The list may also focus on specific listening tasks
that are considered important, for example, listening to directions, listening to a lecture, or
listening on the telephone. It is critical that the skills and tasks listed be as specific as possible
so that they may be objectively measured.

The next step in developing listening assessment instruments is to assemble stimuli that the
students will listen to in the assessment. These stimuli should reflect the listening tasks identified
in the first step. Listening material may be drawn from existing sources. Natural listening
material such as public service announcements, commercials, or news stories makes particularly
good material. It is also possible to write material that particularly reflects the tasks identified
in the first step. Care should be taken to use material that is relatively short, is interesting to
students, and does not reflect a bias toward a particular sex, racial/ethnic, socioeconomic, or
geographic group.

The actual production of stimulus material may take two forms. The material may be written
in script form so that it may be read aloud by the test administrator, or it may be recorded on
audiotape or videotape. The advantage of taped materials is that they guarantee standard
administration and allow for variety in stimulus material, such as various voices, conversations,
or sound effects.

Several possible types of listening items may be developed. The most typical are multiple-choice
items that ask a question about the listening stimulus and provide several possible
response options. Another type is short-answer items that ask a question and require the student
to write a short response. A third type, used for following direction tasks, presents graphic
material, such as a map, and asks the student to complete a certain task like drawing a route
onto the map. A variation of this listening item is to describe an object and ask the student to
draw the object or to select the appropriate object from a set of pictures. In all cases, item
development should follow established standards for item construction that may be found in
measurement textbooks.

It is impossible to identify all the possible confusing or problematic aspects of stimulus
material or items until they have been field tested with a sample of students who are similar
to those who will be assessed. The results of field testing may be used to pick the best stimuli
and items. Measurement textbooks provide some simple techniques for reviewing field test
data. In addition, field testing provides information about the amount of time it takes most
students to complete the items. This information should be used to establish the time limits for
the finalized test.
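
As an illustration of the kind of simple field test review mentioned above, the following Python sketch computes two common item statistics: difficulty (the proportion of students answering an item correctly) and an upper-lower discrimination index (the difference in proportion correct between the highest and lowest scoring thirds of the sample). The choice of these two statistics, the use of thirds, and the function name `item_statistics` are assumptions made for illustration; they are not procedures prescribed by this guide.

```python
def item_statistics(responses):
    """Compute difficulty and upper-lower discrimination for each item.

    `responses` is a list with one entry per student; each entry is a list
    of 0/1 scores, one per item (1 = correct). Hypothetical helper.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(student) for student in responses]

    # Rank students by total score and take the bottom and top thirds.
    order = sorted(range(n_students), key=lambda i: totals[i])
    k = max(1, n_students // 3)
    low_group, high_group = order[:k], order[-k:]

    stats = []
    for j in range(n_items):
        difficulty = sum(student[j] for student in responses) / n_students
        p_high = sum(responses[i][j] for i in high_group) / k
        p_low = sum(responses[i][j] for i in low_group) / k
        stats.append({
            "item": j + 1,
            "difficulty": difficulty,
            "discrimination": p_high - p_low,
        })
    return stats
```

Items that nearly all students answer correctly (or incorrectly), or that high scorers miss as often as low scorers, are candidates for revision or removal before the test is finalized.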

Speaking 

Similar to listening instrument development, to determine the speaking skills and types of
speaking tasks that are important, local developers must first define the types of speaking skills
and tasks students should be able to perform. The steps in this process are the same as they
are for listening. The resulting list may focus on specific skills important in all speaking
situations, for example, speaking distinctly or speaking in an organized fashion. The list may
also focus on specific tasks that are considered important, for example, giving directions, giving
a speech, or asking questions. As with listening, it is critical that the speaking skills and tasks
listed be as specific as possible so that they may be observed and measured.

Two types of approaches are used in assessing speaking behaviors. First, in an observational
approach, the student's behavior may be observed and assessed unobtrusively. Second, in a
structured assessment approach, the student may be asked to perform one or more structured
speaking tasks, and his or her performance on the tasks is then assessed.

If an observational approach is taken, the developer must decide what speaking behaviors
will be observed, for example, asking questions, responding to questions, or speaking in group
discussions. In addition, the developer needs to decide how many times each student will be
observed and for how long. The observer may be the regular classroom teacher or someone
from outside the classroom, such as a teacher from another grade level, a chairperson, or a
counselor.

If a structured approach is adopted, the developer must decide on what type of speaking
tasks will be used. Also the developer must decide on the setting for the tasks. The student
might be asked to perform certain tasks in front of the entire class, in a small group setting,
or in a one-on-one situation with the assessor. Again, the assessor may be the classroom teacher
or someone from outside the classroom.

Next, a scoring system that describes acceptable and unacceptable levels of performance for
the speaking skills or tasks already identified in the first step must be developed. The scoring
system may involve a two-point determination: the behavior of interest is either present or
absent (for example, the student can be heard or cannot be heard). Alternatively, the scoring
system may define a continuum of behaviors that range from lowest to highest: the student is
very disorganized while speaking, somewhat disorganized, fairly well organized, or very well
organized. However, when a continuum is used, it is necessary to describe each level of the
scale in terms of specific behaviors that represent that point in the scale. The resulting scoring
system will be used either for observation ratings or structured ratings, as determined previously
in the second step.

Once the basic approach is established and the scoring system is developed, it is necessary
to train raters in the use of the system. Training should include thorough instruction in the
categories in the scoring system, provision of examples of performance that represent the various
categories, and opportunities for the raters to practice rating student performance. Raters should
have ample opportunity to ask questions about the categories and discuss their practice ratings.

Often training will lead to alterations in the scoring system. It is possible that some initial
distinctions made in the scoring system will prove impossible to observe in actual performance.
Once the system is finalized and raters are comfortable in their ability to make ratings, raters
should be tested for interrater reliability. They should be given several samples of performance
and asked to rate them without discussion. The degree of agreement among the raters is a
measure of interrater reliability. Raters should also be trained in test administration procedures,
either for observations or for structured assessments.
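
The degree of agreement between two raters can be summarized in several ways. The Python sketch below shows two of them under stated assumptions: simple percent agreement, and Cohen's kappa, which adjusts for the agreement expected by chance. Kappa is offered here only as one widely used index and is not specified in this guide; the function names and the sample ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of samples on which two raters assigned the same category."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters (Cohen's kappa)."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    # Agreement expected if each rater assigned categories independently
    # at his or her own observed rates.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four speech samples on a two-point scale.
rater_1 = [1, 1, 2, 2]
rater_2 = [1, 1, 2, 1]
print(percent_agreement(rater_1, rater_2), cohens_kappa(rater_1, rater_2))
```

A local developer would decide in advance what level of agreement is acceptable; raters falling below it would receive further training before rating actual student performances.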

The final steps in conducting the assessment are data collection and scoring. These activities
may happen simultaneously or in stages. Ratings may be made on the spot, or the speaking
performance of students may be audiotaped or videotaped and scored at a later time. The
advantage of recording performance is that it allows for scoring in a more controlled
environment.

In addition to testing interrater reliability at the end of training, the reliability of the ratings
also should be checked during the assessment. If ratings are conducted on the spot, it is necessary
to have more than one person simultaneously rate students. If ratings are conducted later, it is
necessary to have more than one person rate the recordings. Checking the reliability does not
have to occur for every rating but should be conducted at random for at least 10 percent of the
ratings.
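
The random selection of at least 10 percent of ratings for a reliability check might be carried out as in the following Python sketch. The function name, the use of student identification numbers, and the optional seed (included only so that a selection can be reproduced) are assumptions for illustration.

```python
import random


def select_for_second_rating(rating_ids, percent=10, seed=None):
    """Randomly pick at least `percent` percent of ratings for a second rater.

    The sample size is rounded up so that the selected share never falls
    below the requested percentage.
    """
    ids = list(rating_ids)
    k = (len(ids) * percent + 99) // 100  # integer ceiling of n * percent / 100
    rng = random.Random(seed)
    return sorted(rng.sample(ids, k))


# Hypothetical example: 25 recorded performances, identified 1 through 25.
print(select_for_second_rating(range(1, 26), seed=1))
```

With 25 recordings the sketch selects three for double rating, since two would fall short of 10 percent.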



SELECTED RESEARCH AND DEVELOPMENT PRIORITIES

What Should Be Tested?

Ideally we should test what we teach. But in most cases adoption of oral communication
competencies by state and local educational agencies is a new phenomenon. Moreover, even
when such competencies are adopted, the extent and fidelity with which they filter down to
classroom practice is unknown. Therefore, a major research priority in assessment is to
determine current classroom practices in speech communication. While educators periodically
conduct status surveys of classes designated as "Speech" (e.g., Brown et al., 1979; Rubin,
1980b), the need here is for a more comprehensive survey of all English language arts instruction.

Most probably, we would discover that deliberate teaching of oral communication skills is
largely neglected in American public schools. Even in Great Britain, where "oracy" has enjoyed
greater emphasis in curriculum documents, little explicit speech communication instruction
takes place (Barnes, 1980). Therefore, the content domain of communication assessment
probably cannot be defined by what is taught, but by what ought to be taught. As Barnes (1980,
p. 125) observes of the British schools, "Any monitoring of oracy during secondary schooling
will be proposing a wider range of curricular concerns in oracy than schools presently undertake.
. . . Thus, in secondary schools at least, the monitoring of oracy is likely to be leading practice
in schools rather than responding to it." We return in a later section on educational utility to
the issue of what Wilkinson (1968) terms "washback."

The specification of a content domain for testing, then, exerts impact on instruction. Wiemann
and Backlund (1980) describe some of the divergent attempts to define "communication
competence." Larson (1978) notes that the definitional problem is the greatest impediment to
assessing speech communication. Testers who have accepted the definition of the National
Speech Communication Competency Project (Allen and Brown, 1976) in constructing measures
of listening and speaking skill were unable to devise suitable items for all of the components
specified (McCaleb, 1979; Mead, 1977). Furthermore, all components of communication
competence may not be within the proper purview of the public schools. For example, Wiemann
(1977) includes self-disclosure among communication competence behaviors, but public
school officials may not believe it is their role to inculcate self-disclosure in their students.
Thus, it may not be feasible or advisable to test the entire domain of communication competence
(assuming it can be defined in a satisfactory fashion). However, no principles or methods exist
for sampling from the content domain.

An especially troublesome issue pertaining to the validity of oral communication tests
concerns the role of language knowledge and general verbal ability. Functional communication
competence, it is generally agreed, is the ability to use codes (verbal and nonverbal)
appropriately in situations (e.g., Larson et al., 1978). Instruments that measure knowledge of the
conventions of standard English are not tests of communication competence, although
procedures calling for use of standard English in particular contexts may be appropriate (Rubin,
1980a). Some tests of oral expression utilize communication contexts like story-telling or
interviews, but assign scores by emphasizing isolated language skills such as articulation,
standard grammar (14), or vocabulary and sentence expansion (34). Some commercially
available tests of listening comprehension appear to be little more than measures of general verbal
ability (Kelly, 1963).

In summary, we propose the following research and development priorities relating to what
should be tested in measures of speaking and listening proficiency:

• Conduct comprehensive surveys of classroom practices in oral language arts instruction. 

• Define the content domain of communication competence. 

• Delineate components in the content domain that are not appropriate for public school 
instruction. 

• Devise principles for sampling from the content domain. 

• Develop measures that distinguish between communication competence and general verbal
ability.



How Can Criterion Reference Validity Be Determined? 

In general, existing instruments for testing communication competence have not been subjected
to studies of concurrent or predictive validity. One reason for this may be the rapidity with
which state and local education agencies have needed to set up assessment programs. The lack
of accepted criteria against which tests may be validated constitutes another reason. Ability
test scores are one source of information about concurrent validity, but they are not satisfactory
as the only criteria. Holistic teacher ratings of general communication skills such as those
envisioned as the first phase in the Massachusetts Assessment of Speaking Skills (22) might
prove suitable for this purpose. However, initial data from this project indicated that holistic
ratings might be subject to bias and unreliability. Sociometric analyses using peer interaction
data could also serve as criteria for concurrent validity. Criteria for studies of predictive validity
could include teacher or job ratings at some later point in time.

Establishing criterion referenced validity seems particularly crucial in assessment tasks that
are obviously contrived solely for the purpose of evaluation. Several assessment procedures
require students to communicate in role-playing situations (Massachusetts Listening and
Speaking Assessment, 1980; Rubin, 1980a). While such procedures permit evaluation of life role
communication skills, the relationship between role-playing performance and natural
communication performance is unknown. Other procedures require interviews or conversations with
an assessor (e.g., 14). However, Barnes (1980) notes that in British Certificate of Secondary
Education examinations, students display different communication behaviors in peer groups
than in private interviews.

In summary, we propose the following research and development priorities relating to criterion 
referenced validity in measures of speaking and listening proficiency: 

• Establish criterion measures for measuring concurrent and predictive validity. 

• Explore naturalistic criterion measures for these purposes.

• Investigate criterion referenced validity of contrived communication tasks.

Are Measures Reliable?

Researchers in written composition have recognized for some time the multitudinous sources
of inconsistency in writing evaluation (Braddock, Lloyd-Jones, and Schoer, 1963), and
refinements in scoring procedures have continued to be a major focus of research and development
in that field (e.g., Cooper and Odell, 1977). The field of speech communication, in contrast,
seems to have pursued investigations of test reliability less vigorously in the past fifteen years
(Rubin, 1981).

Some attention has been given to internal consistency or dimensionality in studies of speech
rating scales (Becker, 1962), and it appears to be common practice for commercial tests of
listening skill to report this aspect of reliability. But other related issues have not been addressed.
For example, choice of topic in assessments of writing skill is a significant factor in students'
scores (Rosen, 1969). Yet several locally developed measures of speaking proficiency offer
students a choice of topics (e.g., 14; Rubin, 1980a) with no apparent evidence of equivalence
between topics. Recently developed tests of listening ability utilize tape recorded stimuli to
avoid variation in administration (e.g., Mead, 1977), and Wilkinson (1968) observes that
interviewer-assessor idiosyncrasies can alter performance in speaking assessments. Several
writers have commented that single samples of speech are not reliable indicators of
communication competence, and that several samples ranging over a variety of speech functions and
situations should be taken for a fair assessment (Barnes, 1980; Hitchman, 1966). At present,
though, we lack the sort of precise information concerning the requisite size of a reliable sample
that researchers in written syntactic complexity have obtained (e.g., Crowhurst, 1977). Virtually
no information is available concerning test-retest reliabilities of speaking and listening
assessment instruments. Although individual test developers have no doubt done considerable work
in establishing training procedures to engender interrater consistency, these procedures have
not been shared in the literature like the corresponding rater training programs in written
communication (e.g., Diederich, 1974).

In summary, we propose the following research and development priorities relating to reli- 
ability in measures of speaking and listening proficiency: 

• Determine equivalence of varying topics and communication tasks.

• Determine impact of interactive assessor or test administrator on student performances.

• Determine size and diversity of speech sample required for reliable indication of
competence.

• Ascertain test-retest reliabilities of existing instruments.

• Refine and publish methods of enhancing interrater reliability.

What Measurement Techniques Are Presently Available? 

Clearly the committee wishes to encourage the development of new measurement techniques
or the adaptation of research methodologies for purposes of evaluation. However, it is
worthwhile examining some of the strengths and weaknesses of those already available. The
best assessment procedures are those that are least intrusive. Naturalistic observation, and
even classroom teacher ratings, however, introduce problems of rater bias and problems of
consistency in tasks and interactants.

Indirect tests of speaking ability would alleviate many sources of inconsistency. In one
notable effort (20), it proved quite difficult to construct suitable items. Moreover, indirect tests
may be contaminated by extraneous factors like reading ability and "test-wiseness." Also,
such indirect tests are likely to exert deleterious "washback" effects on speech communication
instructional practices, and this can lead to a focus on rote knowledge rather than internalized
skill (Rubin, 1980a).

Direct tests of listening ability present fewer problems involving consistency, particularly
when test stimuli are tape recorded. Even so, measures of listening in conversation are elusive.
Interactive listening, where the listener is an equal conversational partner who responds and
is ever ready to switch into the role of speaker, probably calls on different skills than procedures
in which test-takers listen to a tape recorded conversation, and yet different skills from
procedures in which test-takers demonstrate their understanding of oral commands. It is possible
that referential communication accuracy tasks (Dickson and Patterson, 1979) could be adapted
as effective procedures for assessing interactive listening skills.

Some scoring procedures adopt objective metrics that are taken as indicators of communication
quality. Frequently these objective metrics are linguistic variables. McCaleb (1979), for
instance, suggests the use of T-unit length, an index of syntactic complexity, as one of several
measures of speaking proficiency. Other writers, however, have pointed out that syntactic
complexity varies with each communication task, and it is not directly related to quality
(Crowhurst, 1980). In addition to linguistic variables, other objective metrics are
measures of various features of message content, such as narrative elements not explicitly
depicted in a stimulus picture. Little is known of the criterion referenced validity of such
objective message variables.

In tests of speaking ability, use of rating scales predominates. Typically, rating scales are
applied to either extended talks or interview situations. In British Certificate of Secondary
Education examinations, oral reading and conversation are the most common speaking situations
to which rating scales are applied (Hitchman, 1968). It would be useful to adapt the use of
rating scales to less intrusive, more interactive communication situations like small-group
discussions (Barnes, 1980).

In summary, we propose the following research and development priorities relating to
presently available measurement techniques:

• Enhance reliability of naturalistic observation procedures.

• Develop measures of listening in interactive situations.

• Establish criterion referenced validity of objective linguistic and message content features.

• Extend performance rating scales to less intrusive, more interactive communication
situations like small-group discussion.

Are Instruments Susceptible to Group Biases? 

Consistent group differences in test scores are not, in and of themselves, evidence of test bias.
Rather, test bias can only be ascertained by determining if an instrument over- or under-predicts
a particular group's performance on some criterion measure. As discussed in a previous section,
however, we presently lack any universally accepted standards for criterion referenced validity
of communication assessment procedures.
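
Were a suitable criterion measure available, the over- and under-prediction check described above could be sketched as follows in Python. The sketch fits a single least-squares line predicting criterion scores from test scores for all students together and then averages the prediction errors within each group. It is a simplified illustration of the general idea, not the full procedure of any cited author; the function names and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_prediction_error_by_group(test_scores, criterion_scores, groups):
    """Average (actual - predicted) criterion score within each group.

    A positive mean indicates that the common regression line under-predicts
    the group's criterion performance; a negative mean indicates
    over-prediction.
    """
    slope, intercept = fit_line(test_scores, criterion_scores)
    errors = {}
    for x, y, g in zip(test_scores, criterion_scores, groups):
        errors.setdefault(g, []).append(y - (slope * x + intercept))
    return {g: sum(e) / len(e) for g, e in errors.items()}


# Hypothetical data: group B earns higher criterion scores than group A
# at every test score, so the common line under-predicts for group B.
test = [1, 2, 3, 1, 2, 3]
criterion = [1, 2, 3, 2, 3, 4]
group = ["A", "A", "A", "B", "B", "B"]
print(mean_prediction_error_by_group(test, criterion, group))
```

Mean errors near zero for every group would be consistent with an absence of predictive bias, whatever the difference in the groups' average test scores.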

Nevertheless, culture bound evaluation materials will likely favor one cultural group over
another. Such materials may include culture bound communication contexts (e.g., role-playing
a business executive), evaluation criteria (e.g., standard English pronunciation and syntax),
and test stimuli (e.g., "Point to the grandfather clock").

A less obvious source of potential bias against particular cultural groups is the very notion
of an oral communication assessment. Gay and Abrahams (1973) claim that black youngsters
generally construe the requirements of direct questioning by adults differently than do white
middle-class children. Similarly, Philips (1970) describes socialization patterns among Native
American Indians that render an oral communication assessment an anomalous
communication context. In addition to biases against particular cultural groups, it is possible that
communication assessment procedures may treat particular individuals differentially. Certainly
individuals with organic speech defects will not be subjected to the same testing procedures as
others.

However, certain personality traits may likewise cause communication disorders. Best
known among these is communication apprehension (McCroskey, 1977). Will special provisions
be made for communication apprehension, or, if not, will the public schools be committed to
"remediating" this condition as a part of their responsibility to prepare students for
communication competency assessment?

In summary, we propose the following research and development priorities relating to test
bias in procedures for evaluating speaking and listening proficiency:

• Develop criterion measures with which test bias may be determined.

• Identify culture-bound communication contexts, evaluation criteria, and stimulus materials.

• Determine the degree to which oral communication assessment is inherently biased against
particular cultural groups.

• Clarify the status of personality traits vis-a-vis test bias.

Should Communication Competence Be Assessed?

One reason to assess communication competence is not to certify proficiency among individual
students, but to evaluate oral communication instruction (McGlone, 1973). Of course, if our
supposition of negligible communication instruction is accurate, then this motive is obviated.
Another reason for assessment, however, is to encourage and guide the innovation of speech
communication instruction. Testing tends to legitimize a teaching field, and test specifications
may "washback" (Wilkinson, 1968) to instructional practice. Thus, important questions pertain
to effects of communication assessment on teachers' and administrators' attitudes toward the
legitimacy of speech communication, effects on curricular innovation, and effects on classroom
practices. These questions concern the utility of measurement efforts.

One negative effect of any testing program is the deterioration of student attitudes. Partly
this is a function of the ends to which test results are put. Given the generally dubious
psychometric adequacy of most present speaking and listening instruments, it would seem rash to
use them for decisions of great consequence. In any event, it is worthwhile investigating whether
any potential benefits of evaluating communication skills are offset by negative attitudinal
outcomes.

Because large scale assessment of speaking and listening skills is not widespread, little
information is available concerning its effects on institutional allocations. What are the costs
of oral communication testing in terms of instructional hours lost, personnel hours expended,
and dollars spent? It is indeed likely that many administrators do not encourage large scale
direct measurement of speech communication competency because they fear it will be too
costly. We lack cost-effectiveness studies such as those that have been conducted in conjunction
with direct evaluation of writing ability (e.g., Hudson and Veal, 1979). Again, such cost
estimates must be weighed against presently unquantifiable utility.

In summary, we propose the following research and development priorities relating to the
utility and advisability of procedures for assessing speaking and listening proficiency:

• Ascertain whether measures are sensitive for purposes of assessing instructional impact.

• Determine effects of assessment programs on teachers' and administrators' attitudes toward
communication education.

• Determine the curricular and instructional results of assessment programs.

• Identify the ends toward which test results are put.

• Ascertain effects of evaluation on students' attitudes toward communication.

II. Reviews of Oral Communication Assessment
Instruments



This chapter provides individual reviews of forty-five testing instruments that assess oral
communication skills. In some cases, the primary purpose of an instrument is to assess a particular
oral communication skill. In other cases, an instrument assesses some facets of oral
communication in conjunction with other skills.

1. Brown-Carlson Listening Test

AGE RANGE: Secondary, college, and adult. 



SKILLS TESTED: ___ Speaking
                X  Listening
               ___ Interaction
               ___ Visual Encoding
               ___ Visual Decoding
               ___ Subskill or Attitude

COST: Not specified. 



TIME REQUIRED FOR ADMINISTRATION: Fifty minutes. 

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test assesses:
(1) immediate recall, (2) following directions, (3) recognizing transitions, (4) recognizing
word meanings, and (5) lecture comprehension. The test administration is oral. Two forms
are available (Am and Bm). Each form includes seventy-six multiple-choice items.

NORM/CRITERION DATA: The test was normed on a sample of approximately 8,0(30 
secondary level students and 300 college fresh it; , The high school sample was fairly 
representative of the national population with re .rect to age and ability level: 

VALlbltV 

Predictive: The test correlated .21 and .28 with high school rank and .41 with honor point 
ratio. 

Concurrent: Correlations with tests of mental ability ranged from .69 to .78 among high 
school students and from .21 to .55 among college students. Correlations with reading 
tests ranged from .47 to .60 among high school students and from .31 to .38 among 
college students. 

Content and Item Selection: Test content was based on professional criteria, research, 
and expert judgment. A large sample of items was originally created and the best items 
were selected based on the results of a series of field tests. 

Construct and Other Empirical Studies: No information provided. 
RELIABILITY 

Alternate Forms: Alternate forms reliability yielded a median estimate of .78. 

Test-Retest: No information provided. 

Scoring: Not applicable. 

Internal Consistency: Split-half correlations ranged from .84 to .90. 
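
For readers unfamiliar with the statistic, a split-half correlation of the kind reported above can be sketched as follows. This is only an illustration under stated assumptions: the odd/even split, the function names, and the response data are hypothetical, and the review does not say how the test halves were formed or whether the Spearman-Brown step-up (a standard adjustment from half-test to full-test length) was applied.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half(item_scores):
    """item_scores holds one list of 0/1 item scores per examinee.

    Returns the correlation between odd-item and even-item half scores,
    followed by the Spearman-Brown estimate for the full-length test.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson(odd_half, even_half)
    return r_half, 2 * r_half / (1 + r_half)


# Hypothetical responses: four examinees, six items each (not data from the test).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
r_half, r_full = split_half(responses)
print(round(r_half, 3), round(r_full, 3))
```

The stepped-up value is always at least as large as the half-test correlation when that correlation is positive, which is one reason a reader may want to know which of the two figures a manual reports.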

EVALUATIVE REACTIONS 

Practicality: The test is simple to administer, it provides useful subscales, and seems 
practical for both classroom and research use. 

Validity For Specific Purposes and Populations: The test has been used with a large 
number of different cultural, ethnic, educational, and social groups. Its subscales provide 
specific information about component skills. 

Reliability: Evidence indicates test reliability is good. 

Overall Adequacy: The test may be tapping more into general intelligence than it is into 
listening. It is possible to view the two constructs as independent. This test does not do 
that. 

MATERIALS REVIEWED: Brown, J. I., and Carlson, G. R. Brown-Carlson Listening 
Comprehension Test. New York: Harcourt, Brace & World, 1955. 



OTHER REFERENCES: None. 
REVIEWER: John Daly 

2. California Achievement Test: Listening for 
Information, Level 10 

AGE RANGE: Kindergarten. 

SKILLS TESTED: _ Speaking 
X Listening 
_ Interaction 
_ Visual Encoding 
_ Visual Decoding 
_ Subskill or Attitude 

COST: Not specified. 

TIME REQUIRED FOR ADMINISTRATION: Twenty-five minutes. 

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test measures 
(1) school vocabulary, (2) terms related to space, direction, and location, and (3) the 
relationship between facts and concepts. It contains sixteen multiple-choice items. The examiner 
reads a short story to the students. The students are asked to pick the picture out of three 
choices that answers a question about the story. 

NORM/CRITERION DATA: The test was normed on a national sample of public and 
Catholic school students. About 200,000 students were involved overall. Norming was 
conducted in both fall and spring. 

VALIDITY 

Predictive: No information provided. 

Concurrent: No information provided. 

Content and Item Selection: No information provided. 

Construct and Other Empirical Studies: No information provided. 

RELIABILITY 

Alternate Forms: Not applicable. 

Test-Retest: No information provided. 

Scoring: Not applicable. 

Internal Consistency: No information provided. 

EVALUATIVE REACTIONS 

Practicality: The test is group administered. Easy-to-follow instructions are provided. 
Numerous scoring procedures are available. 

Validity For Specific Purposes and Populations: Inadequate information to judge. 
Additional information may be available or forthcoming. 

Reliability: Inadequate information to judge. Reliability information is published in a 
technical manual but was not available to the reviewer. 

Overall Adequacy: Listening is a small priority in this achievement series, and is only 
assessed in the first level as a prereading skill. 

MATERIALS REVIEWED: California Achievement Tests. Monterey, CA: CTB/McGraw-Hill, 
1977. 

OTHER REFERENCES: None. 



REVIEWER: J. C. McCrosky 

3. CIRCUS Listening Test 

AGE RANGE: Kindergarten-grade 3. 

SKILLS TESTED: _ Speaking 
X Listening 
_ Interaction 
_ Visual Encoding 
_ Visual Decoding 
_ Subskill or Attitude 

COST: Hand scored edition $U.25; machine scored edition $18.50. The basic package 
includes thirty-five booklets, teacher's edition, and user's guide. 

TIME REQUIRED FOR ADMINISTRATION: Forty minutes. 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test measures ability 
to listen to a story, understand and interpret events in it, remember sequence of events, 
and understand vocabulary. All items are drawn from a story read by the teacher about a 
circus. The story is presented in parts and the child marks one of four pictures in response 
to a question. Two forms of the listening test are available (C and D). 

NORM/CRITERION DATA: A sample of over 15,000 children was used for norming Form 
C. Over 14,000 children were assessed for norming Form D. Data used for norming were 
weighted according to variables by which the sample was stratified: region, size of 
community, socioeconomic status, and proportion of minority population. 

VALIDITY 

Predictive: Samples of children who took one level of the test in the fall were administered 
a higher level test the following spring; the correlation was .75. 

Concurrent: Teacher ratings were collected through the Child Competency and Learning 
Inventory as an independent measure of the same abilities measured by the test. Correlations 
between teacher ratings and the listening test were .47 on Form C and .43 on Form D. 

Content and Item Selection: Item selection was based on the judgment of early childhood 
experts with a view toward assessing domains which are of interest to teachers and which 
can be affected by curricula. Each form went through two pretest examinations. Final 
items were selected based on rigorous evaluation. 

Construct and Other Empirical Studies: The authors point to the relationship among 
various forms of the test as evidence of construct validity. Correlation between Forms C 
and D was .75. 

RELIABILITY 

Alternate Forms: The correlation between Forms C and D was .75. 

Test-Retest: Children were administered separate forms of the test in fall and spring. The 
correlation was .75. 

Scoring: Not applicable. 

Internal Consistency: The average inter-item correlations for Form C were .85, .80, .79 
and for Form D were .79, .78, .81. 

EVALUATIVE REACTIONS 

Practicality: Group administration and optional machine scoring make this test practical 
for large scale assessment. Instructions are clear and the test can be easily administered 
by teachers. 

Validity For Specific Purposes and Populations: Evidence indicates that test validity is 
good. However, this test does not provide an opportunity for children to listen and respond 
in a conversational way. 

Reliability: Evidence indicates that test reliability is very good. 

Overall Adequacy: This is a well designed test, with a rigorous research base, for assessing 
general school readiness. It is not a test of speech communication ability, but a paper and 
pencil test. Listening measured in this manner correlates with reading ability. The 
relationship to the ability to talk with or inform others is unknown. 

MATERIALS REVIEWED: Educational Testing Service. CIRCUS Listening Test. Reading, 
MA: Addison-Wesley Publishing Co., 1979. 



OTHER REFERENCES: None 
REVIEWER: Janice Patterson 

4. CIRCUS Say and Tell 

AGE RANGE: Preschool-grade 3. 

SKILLS TESTED: X Speaking 
_ Listening 
_ Interaction 
_ Visual Encoding 
_ Visual Decoding 
_ Subskill or Attitude 

COST: $5.50 for ten booklets. 

TIME REQUIRED FOR ADMINISTRATION: Not specified. 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test measures 
productive language. There are four levels. Level A is appropriate for preschool and kindergarten. 
Levels B through D are identical and appropriate for kindergarten through grade 3. Each 
level has three parts. The test is individually administered. For Part I the child describes two 
objects: one in a structured response situation and one in a free response situation. In Part 
II the child is shown pictures and is asked to generate responses that require the correct use 
of plurals, verb tenses, prepositions, subject-verb agreement, comparatives, possessives, and 
conjunctions (for levels B, C, and D). Responses to Parts I and II are scored as correct, 
partially correct, or incorrect based on protocols. For Part III the child is shown a picture 
and asked to describe it. Responses to Part III are scored in terms of number of words, 
number of different words, and presence of several qualitative criteria, such as "naming at 
least four objects or characters." 
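
The two quantitative Part III measures, number of words and number of different words, can be illustrated with a short sketch. The tokenizing rule used here (lowercased runs of letters and apostrophes) and the sample description are assumptions made for illustration; the actual scoring protocol is defined in the test materials and may count words differently.

```python
import re


def part_three_counts(transcript):
    """Return (number of words, number of different words) for a transcript.

    Assumed rule: a word is any run of letters or apostrophes, and case is
    ignored when deciding whether two words are the same.
    """
    words = re.findall(r"[a-z']+", transcript.lower())
    return len(words), len(set(words))


# Hypothetical child's description of a picture.
total_words, different_words = part_three_counts(
    "The dog runs. The dog sees a big cat."
)
print(total_words, different_words)
```

The qualitative criteria, such as naming at least four objects or characters, call for a scorer's judgment and are not attempted here.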



NORM/CRITERION DATA: Level A was normed with a sample of 227 preschoolers and 
41 kindergarten students. Level B was normed with a sample of 805 students mostly ages 
6 and 7. Additional norm data may be available. 



VALIDITY 
Predictive: No information provided. 
Concurrent: No information provided. 
Content and Item Selection: No information provided. 
Construct and Other Empirical Studies: No information provided. 

RELIABILITY 
Alternate Forms: Not applicable. 
Test-Retest: No information provided. 
Scoring: No information provided. 

Internal Consistency: Average inter-item correlations range from .49 to .90 for the various 
parts of Form A. Additional information on other forms may be available. 

EVALUATIVE REACTIONS 

Practicality: The test must be individually administered. However, clear directions are 
provided. Scoring is somewhat cumbersome. 

Validity For Specific Purposes and Populations: Inadequate information to judge. 

Reliability: Evidence indicates test reliability is adequate. 

Overall Adequacy: The test provides an adequate sample of children's productive 
language. 

MATERIALS REVIEWED: Educational Testing Service. CIRCUS Say and Tell. Reading, 
MA: Addison-Wesley, 1979. 

OTHER REFERENCES: None. 
REVIEWER: Nancy Mead 

5. Cloze Listening Test 

AGE RANGE: Secondary. 

SKILLS TESTED: _ Speaking 
X Listening 
_ Interaction 
_ Visual Encoding 
_ Visual Decoding 
_ Subskill or Attitude 

COST: Tapes cost $25 per set. Test forms cost $1.50 per form. 

TIME REQUIRED FOR ADMINISTRATION: Twenty minutes. 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test measures: 
(1) recall of specific information, (2) ability to grasp the thought of the passage as a whole, 
(3) ability to apply the limited number of contrastive units which identify the word patterns, 
and (4) grammatical structures of spoken American English. A short fictional episode is read 
aloud (about ten minutes in length). Then several excerpts are read aloud by the same narrator. 
Within each excerpt several words (nouns and main verbs) are replaced by a chime. Students 
write the missing words on their response sheets. About 40 percent of the selection is included 
in each excerpt. The test includes two forms (Lisbon and Waco). 



NORM/CRITERION DATA: The test was normed on 636 students in ten runs. 



VALIDITY 



Predictive: No information provided. 

Concurrent: With 107 students, the Lisbon form correlated .71 with the Brown-Carlson 
Listening Test, Form Am, Part E, Lecture Comprehension. 

Content and Item Selection: The test was reviewed by ten curriculum specialists who 
judged that the test measures the content it intends to measure. 

Construct and Other Empirical Studies: With forty-six subjects, the Lisbon form 
correlated .79 with the Terman-McNemar Test of Mental Ability, Form C. 

RELIABILITY 

Alternate Forms: With eighty-three students, the two forms correlated .92. With 130 
students, the two forms correlated .87. 

Test-Retest: No information provided. 

Scoring: Not applicable. 

Internal Consistency: For the ten norming runs, average inter-item correlations ranged 
from .83 to .96. 

EVALUATIVE REACTIONS 

Practicality: The test is easy to administer and score. 

Validity For Specific Purposes and Populations: Evidence indicates test validity is 
adequate. 

Reliability: Evidence indicates test reliability is very good. 

Overall Adequacy: The test only measures exact recall. It does not measure higher level 
listening comprehension skills. 

MATERIALS REVIEWED: Bowdidge, J. S. Cloze Listening Test. Springfield, MO: Drury 
College, 1967. (ERIC Document Reproduction Service No. ED 091 761) 

OTHER REFERENCES: None 



REVIEWER: Nancy Mead 

6. Comprehensive Tests of Basic Skills, 
Tests 2, 3, and 4 

AGE RANGE: Early elementary. 

SKILLS TESTED: _ Speaking 
X Listening 
_ Interaction 
_ Visual Encoding 
_ Visual Decoding 
X Subskill or Attitude: auditory discrimination 

COST: Not specified. 

TIME REQUIRED FOR ADMINISTRATION: Test 2, twenty-one minutes; Test 3, 
nineteen minutes; Test 4, twenty-one minutes. 

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: Test 2, Sound 
Recognition, measures sound recognition in three ways: (1) by saying two words (e.g., lake 
. . . lake or cap . . . cup) and asking the student if the words are the same or different, 
(2) by asking which word (stated and shown in pictures) begins with the same sound as the 
stated word, or (3) by asking which word (stated and shown in pictures) rhymes with the 
stated word. In Test 3, Reading Vocabulary, the student is given an oral definition and has 
to match it with a picture or a word. In Test 4, Reading: Oral Comprehension, the student 
hears a story and has to answer a question about it by picking the correct picture. 



NORM/CRITERION DATA: No information provided. 



VALIDITY 
Predictive: No information provided. 
Concurrent: No information provided. 
Content and Item Selection: No information provided. 
Construct and Other Empirical Studies: No information provided. 

RELIABILITY 
Alternate Forms: Not applicable. 
Test-Retest: No information provided. 
Scoring: Not applicable. 
Internal Consistency: No information provided. 

EVALUATIVE REACTIONS 

Practicality: The test is easy to administer and uses machine-scorable answer booklets. 

Validity For Specific Purposes and Populations: Validity studies are in process but at 
this time information is inadequate to judge. 

Reliability: Reliability studies are in process but at this time information is inadequate to 
judge. 

Overall Adequacy: The test is designed as a reading readiness test. It is less useful as an 
oral communication test. 

MATERIALS REVIEWED: Comprehensive Tests of Basic Skills. Monterey, CA: 
CTB/McGraw-Hill. 

OTHER REFERENCES: None. 

REVIEWER: Nancy Mead 

7. Communicative Evaluation Chart from 
Infancy to Five Years 

AGE RANGE: Infant-kindergarten. 




SKILLS TESTED: X Speaking 
Listening 
Interaction 
Visual Encoding 
Visual Decoding 
Subskill or Attitude: physical development and visual-motor-perceptual skills 



COST: Not specified. 



TIME REQUIRED FOR ADMINISTRATION: Not specified 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test is divided into 
two parts: language use and physical development. The purpose of the test is to determine 
quickly if a child should be referred to a specialist for further testing, therapy, or education. 
Items that assess language focus on (1) coordination of the speech musculature, (2) development 
of hearing acuity and auditory perception, (3) acquisition of vowels and consonants, and 
(4) growth of receptive and expressive language. The test administrator indicates + if the 
skill is present, - if not present, ± if it fluctuates. Numerous minus or fluctuation markings 
indicate further evaluation may be necessary. 
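
The marking scheme described above lends itself to a simple tally. The sketch below is illustrative only: the chart gives no numeric cutoff for "numerous" markings, so the `referral_threshold` parameter and its default value are hypothetical, as are the function name and the sample markings.

```python
from collections import Counter


def tally_markings(markings, referral_threshold=3):
    """Count +, -, and ± markings and flag a possible referral.

    referral_threshold is a hypothetical cutoff; the chart itself leaves the
    judgment of "numerous" minus or fluctuation markings to the examiner.
    """
    counts = Counter(markings)
    concerns = counts["-"] + counts["±"]
    return counts, concerns >= referral_threshold


# Hypothetical markings for five language items.
counts, refer = tally_markings(["+", "-", "±", "+", "-"])
print(dict(counts), refer)
```

A tally of this kind only summarizes the markings; whether a child is actually referred remains a decision for the person administering the chart.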



NORM/CRITERION DATA: No information provided. 

VALIDITY 
Predictive: No information provided. 
Concurrent: No information provided. 

Content and Item Selection: Some items were compiled from sources such as Gesell, 
Binet, Cattell, and others. Other items were included because they were believed to be 
diagnostically significant in working with young children. 

Construct and Other Empirical Studies: No information provided. 

RELIABILITY 

Alternate Forms: Not applicable. 
Test-Retest: No information provided. 
Scoring: No information provided. 
Internal Consistency: No information provided. 

EVALUATIVE REACTIONS 

Practicality: The test requires no training to administer. It is inappropriate for large scale 
assessment because it calls for individual evaluation. 

Validity For Specific Purposes and Populations: Inadequate information to judge. 
Reliability: Inadequate information to judge. 

Overall Adequacy: The test is judged to be poor because there is no explanation of item 
development, validity, reliability, norming, and score interpretation. 

MATERIALS REVIEWED: Anderson, R. M.; Miles, M.; and Matheny, P. A. 
Communicative Evaluation Chart from Infancy to Five Years. Cambridge, MA: Educators Publishing 
Services, Inc., 1963. 

OTHER REFERENCES: None. 

REVIEWER: Janice Patterson 

8. Durrell Listening-Reading Series: Primary, 
Intermediate, Advanced Levels 

AGE RANGE: Primary, grades 1-3; Intermediate, grades 4-6; Advanced, grades 7-9. 

SKILLS TESTED: _ Speaking 
X Listening 
_ Interaction 
_ Visual Encoding 
_ Visual Decoding 
_ Subskill or Attitude 

COST: $15.75 to $19.65 for thirty-five copies. 

TIME REQUIRED FOR ADMINISTRATION: Primary, seventy minutes; Intermediate, 
eighty-five minutes; Advanced, eighty minutes. 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test measures 
comprehension in both the listening and reading modes. The test has three levels: primary, 
intermediate, and advanced. Each level is available in two forms (DE and EF). Each level 
has four parts: (1) listening vocabulary, (2) listening comprehension of sentences (primary) 
or of paragraphs (other levels), (3) reading vocabulary, and (4) reading comprehension of 
sentences (primary) or of paragraphs (other levels). The listening and reading tests are parallel 
in content and difficulty to allow for comparisons. The listening tests do not require reading 
or writing ability for responses. 

NORM/CRITERION DATA: Each level was standardized and normed on a population of 
students with normal intelligence and from average socioeconomic and educational 
backgrounds. The sample consisted of 22,247 students. 



VALIDITY 

Predictive: No information provided. 
Concurrent: No information provided. 

Content and Item Selection: Vocabulary words were selected to represent categories in 
Roget's Thesaurus and assigned to levels based on word lists and field test results. Standard 
item statistics were used in selecting items. 

Construct and Other Empirical Studies: The listening subtests correlated .47 and .52 
with the Metropolitan Reading Readiness Test in grade 1, from .15 to .65 with the word 
knowledge and reading subtests of the Metropolitan Achievement Tests in grades 2 through 
6, and from .48 to .76 with the Iowa Test of Basic Skills in grades 3 through 6. The 
correlations between the reading subtests and the other reading tests listed above were 
higher than those between the listening subtests and the other reading tests. 



RELIABILITY 

Alternate Forms: Listening and reading tests were equated and then reversed to create the 
alternate form. However, no specific alternate forms studies were cited. 

Test-Retest: No information provided. 

Scoring: Not applicable. 

Internal Consistency: Split-half correlations ranged from .92 to .97 for total listening in 
grades 1 through 8. 

EVALUATIVE REACTIONS 



Practicality: The test is group administered. The manual gives specific instructions. 

Validity For Specific Purposes and Populations: Evidence indicates that listening is 
somewhat different from reading, which suggests that the listening test may not be a 
good measure of reading potential, a use that is promoted by the authors. 

Reliability: Evidence indicates test reliability is very good. 

Overall Adequacy: The test is limited in its coverage of listening skills. The test emphasizes 
tasks that are similar in listening and reading and does not address tasks that are more 
typical of just the listening mode. 

MATERIALS REVIEWED: Durrell, D. D., and Hayes, M. T. Durrell Listening-Reading 
Series. New York: Harcourt, Brace, Jovanovich, 1970. 



OTHER REFERENCES: None 



REVIEWER: J. C. McCrosky 

9. Dyadic Task-Oriented Communication 

AGE RANGE: Elementary. 

SKILLS TESTED: X Speaking 
_ Listening 
_ Interaction 
_ Visual Encoding 
_ Visual Decoding 
_ Subskill or Attitude 

COST: Not specified. 

TIME REQUIRED FOR ADMINISTRATION: Not specified. 

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test is a proficiency 
test for non-native English speakers. Students are paired. One student is given a task and 
the other student must respond. Individuals are tested with several partners. Responses are 
tape recorded. Tasks include requests, manipulative instructions, and descriptions. In general, 
criteria for evaluation are (1) time and accuracy and (2) comparison with native speakers. 
Tasks are described and graded by difficulty. Some specific scoring criteria are provided. 

NORM/CRITERION DATA: No information provided. 

VALIDITY 
Predictive: No information provided. 
Concurrent: No information provided. 
Content and Item Selection: No information provided. 
Construct and Other Empirical Studies: No information provided. 

RELIABILITY 
Alternate Forms: No information provided. 
Test-Retest: No information provided. 
Scoring: No information provided. 
Internal Consistency: No information provided. 



EVALUATIVE REACTIONS 

Practicality: The test is practical for use in a classroom setting. 

Validity For Specific Purposes and Populations: Inadequate information to judge. 

Reliability: Inadequate information to judge. 

Overall Adequacy: This type of test could be used with native speakers also. Tasks would 
have to be harder and criteria made more difficult. 

MATERIALS REVIEWED: None. 

OTHER REFERENCES: Findley, C. A. Dyadic Task-Oriented Communication: Exercises 
for Teaching and Testing in the Elementary ESL Class, 1977. (ERIC Document Reproduction 
Service No. ED 145 629) 

REVIEWER: Nancy Mead 

10. DYCOMM: Dyadic Communication 

AGE RANGE: Not specified. 

SKILLS TESTED: X Speaking 
X Listening 
X Interaction 
_ Visual Encoding 
X Visual Decoding 
_ Subskill or Attitude 

COST: User produces materials and scores. 

TIME REQUIRED FOR ADMINISTRATION: Described below. 

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test measures 
communication skills in five areas: (1) word identification, (2) sentence processing, (3) giving-
receiving directions (referential or informing skills), (4) interpreting affect, and (5) problem 
solving. For each task a group of ten or more people work in dyads. After each task, they 
rotate to a new partner. 

The test for word identification skills is a paper and pencil task in which the dyad com- 
municates so that each may correctly identify the target word. Correct responses receive one 
point, incorrect answers lose a point, and no points are scored for blank items. Each individual 
has a five second trial as speaker and five seconds as listener before rotating to a new partner. 
The tester selects words according to population characteristics. 
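
The scoring rule for the word identification task (one point for a correct response, one point lost for an incorrect response, nothing for a blank) can be sketched as below. The function name, the sample words, and the choice to ignore case and surrounding spaces when comparing a response to its target are assumptions for illustration; the DYCOMM materials leave word selection to the tester.

```python
def score_word_identification(responses, targets):
    """Score responses against target words.

    Correct responses add one point, incorrect responses subtract one point,
    and blank items (None or an empty string) leave the score unchanged.
    """
    score = 0
    for response, target in zip(responses, targets):
        if response is None or response.strip() == "":
            continue  # blank item: no points gained or lost
        if response.strip().lower() == target.lower():
            score += 1
        else:
            score -= 1
    return score


# Hypothetical target words and one listener's responses.
targets = ["apple", "river", "chair", "cloud"]
responses = ["apple", "rover", None, "Cloud"]
word_score = score_word_identification(responses, targets)
print(word_score)
```

Because an incorrect answer costs a point while a blank does not, this rule discourages guessing, which is worth keeping in mind when comparing scores across groups.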

The sentence processing task requires speaker and listener to communicate about sentences 
and decide if they are similar or different. Each dyad considers twelve items (for a total of 
twenty seconds) prior to rotating to a new partner. The score is the number of sentences 
marked correctly. 

The giving-receiving directions task calls for the dyads to be seated in a circle, each 
member facing a partner, as they work to identify abstract figures. The drawings are on a 
score sheet which is vertically divided into two parts. On the left side of the sheet, target 
pictures are presented which each person in the speaker role will describe. On the right 
side of the sheet are several sets of five pictures; the pictures in each set are similar. The 
speaker describes the target picture and the listener responds by naming the number of that 
picture on his sheet; both speaker and listener record this number on their papers. The 
members of the dyad alternate roles every thirty seconds. After each dyad has had a turn as 
both speaker and listener, they rotate to new partners. The dyad scores a point for each 
correctly identified picture. Pictures used for this activity are to be selected by the DYCOMM 
user to ensure their appropriateness to the population using the activity. Sample drawings 
are included in the description. 

The fourth task, affects, again calls for children to work in dyads. They discuss a work 
sheet in a way which sends an affective as well as a cognitive message. The task is for the 
listener to identify the correct affective tone. Scoring is simply the number of correct 
responses. Separate scores are tallied for speaker and listener. 

The problem solving task requires the dyad to discuss rules about numbers so they can 
correctly identify numbers on a worksheet. An example of a rule is "The first digit times 
the third digit must be more than 10." There are four rules for each task, with two of the 
rules on the speaker's sheet and two on the listener's. The dyad works forty-five seconds 
on each item and then rotates to new partners. The dyad scores one point for each correct 
choice. 
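
The logic of the problem solving task can be illustrated with a small sketch. Only the first speaker rule below comes from the review; the other three rules, the candidate numbers, and all names are hypothetical and were chosen simply to show why the partners must pool their rules before they can identify the correct number.

```python
def digits(number):
    """Split a whole number into a list of its digits."""
    return [int(ch) for ch in str(number)]


# Two rules on the speaker's sheet.
speaker_rules = [
    lambda d: d[0] * d[2] > 10,  # the example rule quoted in the review
    lambda d: d[1] % 2 == 0,     # hypothetical: second digit is even
]
# Two rules on the listener's sheet (both hypothetical).
listener_rules = [
    lambda d: sum(d) < 15,       # digits add up to less than 15
    lambda d: d[0] > d[2],       # first digit is larger than third digit
]


def satisfying(candidates, rules):
    """Return the candidate numbers that meet every rule in the list."""
    return [n for n in candidates if all(rule(digits(n)) for rule in rules)]


# Hypothetical worksheet item with four three-digit choices.
candidates = [325, 643, 482, 914]
speaker_only = satisfying(candidates, speaker_rules)
listener_only = satisfying(candidates, listener_rules)
together = satisfying(candidates, speaker_rules + listener_rules)
print(speaker_only, listener_only, together)
```

In this illustration neither partner's rules single out one number, but the combined rules do, so the point can be earned only if both sets of rules are communicated accurately.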

NORM/CRITERION DATA: No information provided. 

VALIDITY 

Predictive: No information provided. 
Concurrent: No information provided. 

Content and Item Selection: The tester is instructed to select specific words, sentences, 
emotions, tasks, and problems appropriate to the target population. 

Construct and Other Empirical Studies: No information provided. 

RELIABILITY 

Alternate Forms: Not applicable. 
Test-Retest: No information provided. 
Scoring: Not applicable. 
Internal Consistency: No information provided. 

EVALUATIVE REACTIONS 

Practicality: This test is inappropriate for large scale administration due to a lack of 
available materials. 

Validity For Specific Purposes and Populations: Inadequate information to judge. 
Reliability: Inadequate information to judge. 

Overall Adequacy: This test is a poor measure of communication accuracy due to a lack 
of systematic evaluation. The test provides good instructional techniques because students 
are encouraged to talk with each other. The activities will appeal to many age levels. 

MATERIALS REVIEWED: Byers, B. H. DYCOMM: Dyadic Communication. Honolulu, 
HI: University of Hawaii, 1973. 

OTHER REFERENCES: None. 
REVIEWER: Janice Patterson 



11. The Fullerton Language Test for Adolescents 

AGE RANGE: Ages 11-18, normal and speech impaired. 

SKILLS TESTED: _ Speaking 
X Listening 
_ Interaction 
_ Visual Encoding 
_ Visual Decoding 
X Subskill or Attitude: auditory synthesis, morphological 
competence, homonyms, enumerating members of classes, 
syllabification, understanding idioms, grammatical judgments 

COST: Not specified. 

TIME REQUIRED FOR ADMINISTRATION: Forty-five minutes per subject. 

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: Subtest 3 relates to 
listening and requires the student to receive, retain, interpret, and demonstrate an 
understanding of oral commands. The entire test is individually administered. The twenty items 
range along a dimension of increasingly complex syntactic constructions and logical 
operations. Commands concern manipulations of colored geometric shapes, included in the testing 
kit. Responses are scored dichotomously. 

NORM/CRITERION DATA: The test was normed with 762 subjects aged 11-18 from 
regular classrooms in California and Oregon. Based on this sample, a "competence level," 
"instructional level," and "frustration level" are defined in terms of standard deviations from 
the mean on each subtest. 

VALIDITY 

Predictive: No information provided. 

Concurrent: The scores on the test distinguished between a normal population (N = 489) 
and a special education population (N = 73). 

Content and Item Selection: Content validity was established by theoretical rationale and 
by comparison with similar instruments. 

Construct and Other Empirical Studies: The test discriminates between normal and 
special education populations. Correlations between oral commands and other subtests 
ranged from .37 to .55. 

RELIABILITY 

Alternate Forms: Not applicable. 

Test-Retest: The test-retest correlations exceeded .80 for all subtests. 

Scoring: Not applicable. 

Internal Consistency: The average inter-item correlations exceeded .70 for all subtests. 



EVALUATIVE REACTIONS 

Practicality: The oral commands subtest is easily administered and scored and is not too time 
consuming; however, it requires individual administration. The entire Fullerton battery 
requires forty-five minutes. 

Validity For Specific Purposes and Populations: The test appears useful for distinguishing 
between normal and abnormal language development among adolescents. Little 
justification is given for interpretations of test scores (e.g., competence level). 

Reliability: The reliability of the test is quite adequate, assuming trained administrators 
are consistent in conducting the test. 



Overall Adequacy: The test assesses only a limited type of listening ability using contrived 
and artificial speech stimuli. 

MATERIALS REVIEWED: Thorum, A. R. The Fullerton Language Test for Adolescents 
(Professional Manual). Palo Alto, CA: Consulting Psychologists Press, 1980. 



OTHER REFERENCES: None. 



REVIEWER: Don Rubin 

12. Fundamental Achievement Series, Verbal 

AGE RANGE: Adolescents and adults with limited educational opportunities. 

SKILLS TESTED: _ Speaking 
X Listening 
_ Interaction 
_ Visual Encoding 
_ Visual Decoding 
_ Subskill or Attitude 

COST: No information provided. 



TIME REQUIRED FOR ADMINISTRATION: Thirty minutes. 

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The verbal test 
measures a variety of language skills needed for employment, including vocabulary, reading 
comprehension, listening comprehension, study skills, copying, and spelling. The test has 
two forms (A and B). The entire test is presented orally but only ten items directly measure 
listening comprehension. These items are literal comprehension questions about three brief 
announcements. 

NORM/CRITERION DATA: The test was normed with groups of between 100 and 200 
individuals. Groups included whites and blacks in grades 6, 8, 10, and 12 and numerous 
industrial and anti-poverty program groups. 



VALIDITY 

Predictive: No information provided. 

Concurrent: A number of small concurrent validity studies using criteria such as
supervisors', researchers', and counselors' ratings indicated statistically significant
correlations with the verbal test.

Content and Item Selection: No information provided.

Construct and Other Empirical Studies: The verbal test correlated .36 to .94 with various
tests of general mental ability, indicating that the test is operating in the same general
area of measurement.

RELIABILITY



Alternate Forms: In one study of thirty-nine anti-poverty program participants, the verbal
test forms A and B correlated .74 with two weeks intervening.

Test-Retest: Scores from two administrations correlated .62 to .95 for five groups of
industrial and anti-poverty program participants with three months intervening. However,
in all studies the means increased over time.

Scoring: Not applicable.

Internal Consistency: Average inter-item correlations for the verbal test ranged from .70
to .96 for various groups.



EVALUATIVE REACTIONS 

Practicality: The test is group administered. The entire test is presented on a tape recording,
making it very easy to administer.

Validity For Specific Purposes and Populations: The test is in an experimental stage.
However, numerous small validity studies indicate that it is a good measure of basic skills
and that it works well with disadvantaged populations.

Reliability: Evidence indicates reliability for the total test is good.

Overall Adequacy: The test measures only a small domain of listening skills, and this
section would not stand on its own as a unique measure.

MATERIALS REVIEWED: Fundamental Achievement Series. New York: The Psychological
Corporation, 1964.



OTHER REFERENCES: None.

REVIEWER: Nancy Mead

13. Gary, Indiana Oral Proficiency Examination



AGE RANGE: Grade 10. 

SKILLS TESTED: X Speaking
Listening
Interaction
_ Visual Encoding
_ Visual Decoding
Subskill or Attitude



COST: No materials required; only record keeping costs. 

TIME REQUIRED FOR ADMINISTRATION: Variable, probably no more than five minutes
per student.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: Two weeks prior to
testing students select one of four test formats: (1) the interview format, (2) the topic format,
(3) the question format, or (4) the prepared format. The interview format may pertain to
personal interests, education, or biographical data and includes such questions as, "Do you
plan to marry? If so, specify." and "Where was your father born?" The topic format includes
five subjects such as, "How can a person help to improve his or her school, community,
or country?", one of which students are assigned at the time of testing. In the question
format, students may choose to answer one of five queries such as, "Why are 'rules' made
in our homes, schools, or country?" For the prepared format, students prepare an original
two-minute speech about a subject of their choice, and deliver it "without notes or crutches."
Multiple raters use a holistic scale with four-interval items ranging from "severely deficient"
to "moderate to high proficiency." The scoring categories are: (1) articulation, (2) pronunciation,
(3) verbal utterances (e.g., "you know"), (4) rate, (5) standard word usage, (6) voice
quality, and (7) volume.

NORM/CRITERION DATA: No information provided. However, raters were apparently 
trained by means of samples which exemplify each score/level. 



VALIDITY 

Predictive: No information provided. 
Concurrent: No information provided. 

Content and Item Selection: The criteria directly reflect the program's oral proficiency
performance objectives.

Construct and Other Empirical Studies: No information provided.



RELIABILITY 
Alternate Forms: Not applicable. 
Test-Retest: No information provided.

Scoring: No information provided. However, raters are apparently checked frequently for
wide discrepancies.

Internal Consistency: No information provided.

EVALUATIVE REACTIONS

Practicality: Testing should progress rapidly, especially since students have opportunity
to prepare long in advance of testing. Interview formats allow little opportunity for
spontaneous elaboration or interaction which might be time consuming.

Validity For Specific Purposes and Populations: Criteria are well suited to objectives,
particularly as they are realized in formal, pressured contexts. Criteria seem to be biased
against speakers of nonstandard dialects.

Reliability: The major problem is the unknown equivalence between formats and topics.
Some provision seems to have been made for establishing and maintaining rater reliability.

Overall Adequacy: The contexts are highly artificial and nonmotivating, despite choice
accorded to students. Criteria reflect a narrow range of competencies restricted to elements
of elocution.



MATERIALS REVIEWED: Gary Community School Corporation. Oral Proficiency Program.
Gary, IN: author, 1977-1978.

OTHER REFERENCES: None.

REVIEWER: Don Rubin

14. Glynn County Speech Proficiency Examination



AGE RANGE: Secondary.

SKILLS TESTED: X Speaking
Listening
Interaction
Visual Encoding
_ Visual Decoding
_ Subskill or Attitude

COST: No commercial materials needed.

TIME REQUIRED FOR ADMINISTRATION: Approximately one hour per twenty students.



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: Using scripted instructions,
the test administrator sets up a situation that simulates a public hearing before a county
board of education. Students are provided with an agenda that includes some background
information about three selected issues. Students present persuasive speeches individually.
Two raters score either videotapes or live performances with discrepancies resolved by a
third rater. Rating scales specify four skill level indicators for each of the following
dimensions: (1) introduction, (2) purpose, (3) reasons, (4) organization, (5) objections,
(6) conclusion, (7) language style, (8) oral expression, and (9) gestures.

NORM/CRITERION DATA: No information provided. 



VALIDITY 

Predictive: No information provided.

Concurrent: For a sample of thirty ninth graders, scores on this test correlated .70 with
a parallel form of the test which involved performance on a job interview task rated by
slightly different criteria.

Content and Item Selection: The task and rating criteria conform to locally stated objectives.

Construct and Other Empirical Studies: Ratings of this test correlated .88 with classroom
teachers' judgments of students' typical communication competence. There was also a
highly significant relationship between passing performance on this instrument and students'
placement in ability level tracks by the school system.



RELIABILITY 
Alternate Forms: Not applicable.
Test-Retest: No information provided.

Scoring: Inter-rater reliability was .82 when using videotaped performances and .72 when
scoring live performances. When considering passing versus nonpassing scores, 15 percent
of the students were cross-classified by two raters and required a third rating for resolution.

Internal Consistency: Average inter-item correlations ranged from .82 to .88.



EVALUATIVE REACTIONS 

Practicality: The test is relatively time consuming in both administration and rating. The
authors estimate .2 personnel hours per student. Also the test consumes a large amount
of "down time" for students.

Validity For Specific Purposes and Populations: While rating criteria appear well tailored
to the speaking task, this measure samples only a limited range of speaking competencies.
The cultural bias of the test is unknown.

Reliability: Raters must be trained to achieve reliability and a standardized regimen of
training would need to be developed. Topic and speaking order effects appear to be
insignificant. Test-retest reliability is unknown, but likely a troublesome point.

Overall Adequacy: The test represents strong effort at speech performance assessment.
The measure attempts to create a sense of context. However, a single speech sample
representing just one communication situation is not representative of general speaking
skills.

MATERIALS REVIEWED: Rubin, D. L., and Bazzle, R. E. Development of an Oral
Communication Assessment Program: The Glynn County Speech Proficiency Examination
for High School Students. Brunswick, GA: Glynn County School, 1981.



OTHER REFERENCES: None. 



REVIEWER: Nancy Mead

15. Language Assessment




AGE RANGE: Grades 1-5.

SKILLS TESTED: Speaking
X Listening
Interaction
Visual Encoding
_ Visual Decoding
Subskill or Attitude



COST: $56.50. 



TIME REQUIRED FOR ADMINISTRATION: Not specified. 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test includes five
subscales assessing linguistic proficiency in either Spanish or English. The phonemic
discrimination subscale has thirty items. Subjects determine if two words in which a phoneme
or allophone is embedded sound the same or different. The phoneme production subscale
has thirty-six items. Subjects imitate words or short sentences in which a sound is embedded.
Both are scored right or wrong. One lexical ability subscale, with twenty items, has children
identify words for objects presented in pictures. The oral production subscale has children
orally retell a story that is cued with pictures. It is scored with a five-point rating scale.
Finally, vocabulary is assessed. Procedures for this subscale are unclear in the manual.



NORM/CRITERION DATA: No information provided. 



VALIDITY 

Predictive: The test predicted language achievement better than cognitive style/development
variables. The test accounted for 40 percent of the variance in language achievement
scores.

Concurrent: No information provided.

Content and Item Selection: No information provided.

Construct and Other Empirical Studies: In a factor analysis the subscales of the test
contributed to the same factor.

RELIABILITY



Alternate Forms: Not applicable.

Test-Retest: Scores for two administrations with about one week intervening correlated
.88 in English and .97 in Spanish.

Scoring: The inter-rater reliability, where used, is very high.

Internal Consistency: The internal consistency reliability is quite high.

EVALUATIVE REACTIONS
Practicality: The test is useful for bilingual programs.

Validity For Specific Purposes and Populations: Inadequate information to judge.
Reliability: Evidence indicates test reliability is very good.

Overall Adequacy: The test is narrow in what it assesses. It is very adequate for the
dimensions it does assess.

MATERIALS REVIEWED: De Avila, E. A.; Ulibarri, D. M.; Duncan, S. E.; Fleming,
J. S.; Costa, M.; Perry, J.; and Wainwright, C. Predicting the Academic Success of Language
Minority Students from Developmental, Cognitive Style, Linguistic and Teacher Perception
Measures. 1979.

OTHER REFERENCES: None. 

REVIEWER: John Daly

16. Language Dominance Survey

AGE RANGE: Kindergarten-grade 12.

SKILLS TESTED: X Speaking
X Listening
Interaction
Visual Encoding
Visual Decoding
Subskill or Attitude



COST: Not specified.

TIME REQUIRED FOR ADMINISTRATION: Not specified. 

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test measures students'
abilities to: (1) understand and comply with stated commands and (2) communicate
with acceptable morphology and syntax. It is designed to identify children who need bilingual
education. Consequently, both English and Spanish are used in the test.

NORM/CRITERION DATA: The manual indicates that students scoring less than 50 percent
correct should be placed in a bilingual program. Justification for this recommendation is not
provided in the report.
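
The placement rule reported for the manual can be sketched in a few lines of Python. This is an illustration only: the function name and its parameters are hypothetical and are not part of the Language Dominance Survey materials. A score of exactly 50 percent is treated as not flagged because the rule reads "less than 50 percent."

```python
def needs_bilingual_placement(items_correct: int, items_total: int,
                              cutoff: float = 0.50) -> bool:
    """Return True when the proportion correct falls below the cutoff.

    The 0.50 default mirrors the manual's reported recommendation; the
    review notes that no justification for this value is provided.
    """
    if items_total <= 0:
        raise ValueError("items_total must be positive")
    if not 0 <= items_correct <= items_total:
        raise ValueError("items_correct must lie between 0 and items_total")
    return items_correct / items_total < cutoff
```

For example, a hypothetical student answering 9 of 20 items correctly would be flagged for placement, while one answering 10 of 20 would not.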



VALIDITY 

Predictive: No information provided. 

Concurrent: No information provided.

Content and Item Selection: No information provided. 

Construct and Other Empirical Studies: No information provided. 



RELIABILITY

Alternate Forms: Not applicable.
Test-Retest: No information provided.
Scoring: No information provided.

EVALUATIVE REACTIONS

Practicality: This test is practical for identifying students who should be placed in bilingual
classes.

Validity For Specific Purposes and Populations: Inadequate information to judge.
Reliability: Inadequate information to judge.



Overall Adequacy: The test is limited to very basic language skills. 



MATERIALS REVIEWED: Burt, M. K., and Dulay, H. C. Language Dominance Survey.
Berkeley, CA: BCDEL/Lau Multilingual Center, 1974-1975.

OTHER REFERENCES: None.



REVIEWER: John Daly 



17. Language Facility Test

AGE RANGE: Ages 3-15, normal populations.



SKILLS TESTED: Speaking 

Listening 

Interaction 

Visual Encoding 

Visual Decoding 

Subskill or Attitude 



COST: $20.00.

TIME REQUIRED FOR ADMINISTRATION: Approximately ten minutes.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The child is given three
pictures, one at a time, and asked to tell a story about each one. Each response is scored on
a 0-9 scale with 0 being no response, 1 being a one word response, 2 being a multiple word
response, 3 being a complete sentence, and so forth. The highest response, 9, is an organized,
complete story. The student is given several probes if necessary. The score is the sum of
the ratings on the three pictures. Three alternate forms of pictures are available: photographs,
line drawings, and reproductions of Spanish art masterworks. The test may be given in the
student's native language, sign language, or English. Success is based on elaboration of
language, not on standard grammar or vocabulary.
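
The scoring rule described above (three picture ratings on a 0-9 scale, summed) can be sketched as follows. The function name is hypothetical, and the validation checks are assumptions added for illustration; the test materials themselves are not quoted here.

```python
def language_facility_total(ratings):
    """Sum the three per-picture ratings, each an integer from 0 to 9.

    0 = no response ... 9 = organized, complete story, as described in
    the review. The total therefore ranges from 0 to 27.
    """
    ratings = list(ratings)
    if len(ratings) != 3:
        raise ValueError("exactly three picture ratings are expected")
    for rating in ratings:
        if not isinstance(rating, int) or not 0 <= rating <= 9:
            raise ValueError("each rating must be an integer from 0 to 9")
    return sum(ratings)
```

A child rated 3, 9, and 0 on the three pictures would receive a total of 12 under this reading of the rule.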

NORM/CRITERION DATA: The test was normed on 4000 students ages 3 to 26. Smaller
studies were conducted on special groups, including low achievers, mentally retarded,
handicapped, and rural Spanish speakers.

VALIDITY

Predictive: No information provided.

Concurrent: No information provided.

Content and Item Selection: No information provided.

Construct and Other Empirical Studies: The test correlated slightly with intelligence, 
reading readiness, achievement, and teacher ratings of scholastic performance. 







RELIABILITY

Alternate Forms: The forms correlated .46 to .90 (with intervening instructional activity).
Test-Retest: See above.

Scoring: Inter-rater correlations ranged from .88 to .94 across three pictures.
Internal Consistency: No information provided.

EVALUATIVE REACTIONS 

Practicality: The test requires one-on-one administration and trained scorers. The test is
straightforward and does not take long to administer.

Validity For Specific Purposes and Populations: Evidence indicates that test validity is
adequate.

Reliability: Evidence indicates that test reliability is adequate.

Overall Adequacy: The test focuses on language and cognitive development. It does not
measure functional communication competence.

MATERIALS REVIEWED: Language Facility Test. Alexandria, VA: Allington Corporation.

OTHER REFERENCES: None.
REVIEWER: Nancy Mead

18. Language Skills Communication Task



AGE RANGE: Kindergarten-grade 2.

SKILLS TESTED: X Speaking
X Listening
X Interaction
X Visual Encoding
X Visual Decoding
Subskill or Attitude

COST: User produces materials and scores.

TIME REQUIRED FOR ADMINISTRATION: No time limit.



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test is composed
of two tasks designed to assess the child's ability to get meaning and ideas from conversation,
respond to the language of others, transmit ideas to others, and adapt his or her conversation
to achieve effective communication. These are referential or informing tasks. In each instance,
two children are seated on opposite sides of a picture board which is the same on both sides.
One child is designated speaker and the other listener. The task is for the speaker to tell the
listener where to place the pictures scattered loosely in front of the listener on the board.
The goal is for the listener's picture to match the speaker's. The children are not permitted
to see each other's boards or to make gestures. The listener can ask for more information.
The communication accuracy is scored both individually and as a dyad. The speaker receives
one point in each of three areas by supplying specific verbal instructions for: (1) object
identification, (2) object placement, and (3) object positioning. The listener's score is
determined by (1) selection of correct object, (2) placement of the object, and (3) questioning
when insufficient information is given.

Mean scores are calculated for each task to assess dyadic communication. This score is based
on the average of the six subscores earned by speaker and listener. The listener's success in
placing the object correctly is also a measure of the dyad's communication effectiveness.
Children's responses are recorded; these data are used for analysis.
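
The dyadic score described above can be sketched as the mean of the six subscores. The class and field names below are hypothetical labels for the three speaker areas and three listener areas, and the 0/1 coding of each subscore is an assumption based on the "one point in each of three areas" wording.

```python
from dataclasses import astuple, dataclass


@dataclass
class SpeakerScore:
    identification: int  # 1 if the object was specifically identified
    placement: int       # 1 if placement instructions were specific
    positioning: int     # 1 if positioning instructions were specific


@dataclass
class ListenerScore:
    selection: int    # 1 if the correct object was selected
    placement: int    # 1 if the object was placed correctly
    questioning: int  # 1 if the listener asked when information was insufficient


def dyad_mean(speaker: SpeakerScore, listener: ListenerScore) -> float:
    """Average the six subscores earned by the speaker and the listener."""
    subscores = astuple(speaker) + astuple(listener)
    if any(s not in (0, 1) for s in subscores):
        raise ValueError("each subscore is assumed to be 0 or 1")
    return sum(subscores) / len(subscores)
```

Under these assumptions, a speaker earning 2 of 3 points paired with a listener earning 2 of 3 points yields a dyad mean of 4/6.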



NORM/CRITERION DATA: No information provided.

VALIDITY

Predictive: Children's performance on individual subtests predicted their success in placing
the object correctly. Other results indicated that correct object placement was correlated
significantly with grade (.32), math achievement (.43), and reading achievement (.37). Sex and
intelligence were not significantly correlated with object placement.

Concurrent: No information provided.

Content and Item Selection: No information provided.

Construct and Other Empirical Studies: No information provided.

RELIABILITY 
Alternate Forms: Not applicable.

Test-Retest: Twelve first grade children were given the test twice, a week apart. Children
were randomly paired and randomly assigned to play the same role for both sessions. The
mean percentage of agreement for the task was 89.3 percent, with a range from 78.5 percent
to 100 percent.

Scoring: No information provided.

Internal Consistency: Split-half correlation was .73 for the first task and .76 for the second.
The scores of twelve first graders were used to determine this measure of internal
consistency.

EVALUATIVE REACTIONS 

Practicality: The test is inappropriate for large scale administration due to testing
procedure, i.e., assessing children in pairs. No special training is necessary for test
administration.

Validity For Specific Purposes and Populations: By the author's own admission, more
work must be completed to establish validity of the test.

Reliability: Current reliability measures are inadequate.

Overall Adequacy: Given the similarity in the test design to other measures of referential
accuracy, the test may provide useful data. However, more rigorous, systematic evaluation
is needed before test users can be assured of adequate validity and reliability in the
instrument.

MATERIALS REVIEWED: Wang, M. C.; Rose, S.; and Maxwell, D. The Development of
the Language Communication Skills Test. Pittsburgh: The University of Pittsburgh, Learning
Research and Development Center, 1973.

OTHER REFERENCES: Dickson, W. P., and Patterson, J. H. Communication Education,
1979.

REVIEWER: Janice Patterson




19. Listening Comprehension Tests

AGE RANGE: Battery A, ages 10-11; Battery B, ages 13-14; Battery C, ages 17-18.

SKILLS TESTED: _ Speaking
X Listening
Interaction
_ Visual Encoding
_ Visual Decoding
Subskill or Attitude

COST: Approximately $75.00.

TIME REQUIRED FOR ADMINISTRATION: Approximately two thirty-minute sessions,
with a break.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test includes five
different subtests: (1) content (basic comprehension), (2) contextual constraints (infer
missing parts of conversation), (3) phonology (understand differences in meaning brought about
by different inflections, etc.), (4) register (detect inappropriate uses of language), and
(5) relationships (detect kinds of relationships existing between people from language used).
Students listen to tape recorded stimuli and answer multiple choice questions about what
they heard.

NORM/CRITERION DATA: The test was normed on a sample of 1,152 individuals.



VALIDITY 

Predictive: No information provided.
Concurrent: No information provided.

Content and Item Selection: The test design was based on a theoretical description of
listening.

Construct and Other Empirical Studies: The test correlated from .45 to .60 with Simplex
Junior Intelligence Scale and AH4 Group Test of General Intelligence. The test correlated
from .41 to .75 with Schonell Silent Reading Test, Secondary Reading Test, and Senior
Reading Test.

RELIABILITY

Alternate Forms: Not applicable.
Test-Retest: No information provided.
Scoring: Not applicable.

Internal Consistency: Average inter-item correlations ranged from .78 to .84.

EVALUATIVE REACTIONS 

Practicality: The test is very easy to administer in groups. Instructions and items are all
provided on a tape recording.

Validity For Specific Purposes and Populations: Evidence indicates that test validity is
good.

Reliability: Evidence indicates that test reliability is good.

Overall Adequacy: British accents would be difficult for American children. A larger
problem is the fact that words and topics would be unfamiliar.

MATERIALS REVIEWED: Wilkinson, A.; Stratta, L.; and Dudley, P. Listening
Comprehension Tests. London: Macmillan Education, Ltd., 1974.

OTHER REFERENCES: Wilkinson, A.; Stratta, L.; and Dudley, P. The Quality of
Listening. London: Macmillan Education, Ltd., 1974.

REVIEWER: Nancy Mead

20. MACOSA Listening and Speaking Tests



AGE RANGE: Grades 3, 6, 9, and 12. 



SKILLS TESTED: X Speaking
X Listening
Interaction
Visual Encoding
Visual Decoding
Subskill or Attitude

COST: Not commercially available.

TIME REQUIRED FOR ADMINISTRATION: Listening, sixty to seventy minutes; Speaking
(written), forty minutes; and Speaking (oral), thirty-five to sixty-two minutes for groups
of six.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: For the listening tests,
students listen to tape recordings of various formal and informal listening material and answer
multiple-choice questions about what they heard. Questions measure voice production factors,
linguistic factors (words and sentences), and organizational factors (literal, interpretative,
and critical comprehension).

The speaking tests include both written and oral components. The written tests include
multiple-choice questions which measure knowledge of how voice production, nonverbal,
and linguistic factors can convey meaning and (at grades 9 and 12) knowledge of speech
organization. The oral tests assess articulation (at grade 3 only), spontaneous speech, and
spontaneous-prepared speech (at grades 9 and 12). Third grade students are asked to name
objects in pictures for the articulation test. Students are asked to talk about a picture or a
scrambled outline. Students are asked to read aloud. Their responses are rated along
five-point rating scales.

Attitudes were measured using McCroskey's Personal Report of Communication Fear (grades
3 and 6) and Personal Report of Communication Anxiety (grades 9 and 12).

NORM/CRITERION DATA: The instruments were field tested on 168 to 251 students per
grade.

VALIDITY

Predictive: No information provided.

Concurrent: No information provided.

Content and Item Selection: Content was based on province developed objectives.
Developers used standard procedures to identify difficulty and discrimination power of
multiple choice items.

Construct and Other Empirical Studies: Factor analysis was used to substantiate structure
of content domain. Results were used to revise the tests slightly.

RELIABILITY 



Alternate Forms: Not applicable.
Test-Retest: No information provided.

Scoring: Inter-rater reliability ranged from .75 to .83 using a 0-1 rating scale during pilot
test.

Internal Consistency: Average inter-item correlations for listening tests ranged from .43
to .65. Average inter-item correlations for speaking (written) tests ranged from .39 to .64.



EVALUATIVE REACTIONS 

Practicality: Written tests are very easy to administer. The oral speaking test requires small 
group administration and trained raters. 

Validity For Specific Purposes and Populations: Evidence indicates that validity is
adequate.

Reliability: Evidence indicates test reliability is fair.

Overall Adequacy: The tests cover a very broad range of skills. The tests need further
development of inter-rater reliability.

MATERIALS REVIEWED: Plattor, E.; Unruh, W. R.; Muir, L.; and Loose, K. D. Test
Development for Assessing Achievement in Listening and Speaking. Edmonton, Alberta:
Alberta Education, 1978.

OTHER REFERENCES: None.

REVIEWER: Nancy Mead



21. Massachusetts Assessment of Basic Skills 
Listening Test 



AGE RANGE: Grades 7-12.

SKILLS TESTED: _ Speaking
X Listening
_ Interaction
_ Visual Encoding
Visual Decoding
Subskill or Attitude



COST: Not commercially available. 

TIME REQUIRED FOR ADMINISTRATION: Approximately thirty minutes. 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: This test measures eleven
basic listening skills that encompass understanding and using what is heard. It includes six
stimuli: a news story, a commercial, a telephone conversation, a teacher's announcement, a
public service announcement, and a conversation that takes place during an emergency. Each
passage is brief, uses simple vocabulary, and reflects common listening experiences. The test
is composed of a total of twenty-two multiple-choice items. All materials, including
instructions and all response options, are tape recorded.

NORM/CRITERION DATA: Determination of mastery level performance is not stated. A
statewide survey was conducted involving 2,267 students from forty-nine schools in
Massachusetts.



VALIDITY 
Predictive: No information provided. 
Concurrent: No information provided.

Content and Item Selection: Items were reviewed by a panel of judges. Item difficulties
and discrimination are reported.

Construct and Other Empirical Studies: Field testing suggested that the test is not biased
with respect to ethnic minorities.

RELIABILITY



Alternate Forms: Four alternate forms were developed. Difficulty levels are approximately
equal. Mastery certification decisions (for assumed cut-off scores) agreed 85-90 percent.

Test-Retest: No information provided.

Scoring: Not applicable.

Internal Consistency: Average inter-item correlations ranged from .64 to .86 for the four
forms.

EVALUATIVE REACTIONS

Practicality: The test is easily administered, particularly since all materials are tape
recorded. Test booklets are well formatted.

Validity For Specific Purposes and Populations: The test appears to be highly valid for
stated objectives, with the exception that conversational listening is only obliquely
reflected. The test is superior to most listening tests for assessment of life-role minimum
competencies.

Reliability: The test is relatively short (twenty-two items), and this may limit reliability.
A major advantage is the availability of equivalent forms for purposes of retesting those
who are remediated following unsatisfactory initial testing.

Overall Adequacy: The test samples a variety of important listening situations and skills.
It is not confounded with reading ability. The only significant drawback is the failure to
test listening in an interactive context.

MATERIALS REVIEWED: Massachusetts Department of Education, Bureau of Research
and Assessment. Massachusetts Assessment of Basic Skills 1979-80 Development Report:
Listening/Speaking. Boston: author, 1980.

OTHER REFERENCES: Massachusetts Department of Education. Assessment of Listening
Skills State Test (Secondary Level). Boston: author, 1979.

REVIEWER: Don Rubin

22. Massachusetts Assessment of Basic Skills
Speaking Test 



AGE RANGE: Grades 7-12.

SKILLS TESTED: X Speaking
_ Listening
Interaction
Visual Encoding
Visual Decoding
Subskill or Attitude

COST: No materials; only record keeping costs.

TIME REQUIRED FOR ADMINISTRATION: Initial screening based on typical performance.
One-on-one assessment less than twenty minutes per student.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: For initial screening,
two classroom teachers rate a student's typical communication behavior along four dimensions:
(1) delivery, (2) organization, (3) content, and (4) language. Each criterion is rated on a
four-interval scale ranging from inadequate to superior. Those who do not pass the initial
screening engage in four communication tasks with a single administrator/rater. The tasks
are describing an activity, simulating an emergency telephone call, explaining a cooking
procedure, and simulating a persuasive conversation with a school principal. The rater scores
students' performances on the four tasks in the manner used for initial screening.

NORM/CRITERION DATA: The manner of determining mastery and performance level
was not explained. A statewide survey was conducted involving 691 students in forty-nine
schools in Massachusetts.

VALIDITY 
Predictive: No information provided.

Concurrent: Scores derived from initial teacher screenings were compared with second
phase communication task performance scores. Means for the two types of assessment
were equal. Ratings on individual criteria were almost always within one point of each
other.

Content and Item Selection: Criteria and tasks were reviewed by a panel of experts.

Construct and Other Empirical Studies: Survey data indicate some possibility of racial/
ethnic bias.



RELIABILITY 

Alternate Forms: Four forms of the one-on-one communication tasks were developed. All
forms displayed nearly equal means. For probable cut-off scores, mastery certification
decisions agreed over 90 percent, except for a narrow middle range of scores.

Test-Retest: No information provided.

Scoring: Approximately 95 percent of teacher ratings of typical communication were either
equal or adjacent. Language arts and content area teachers did not differ. A subsample of
one-on-one ratings was rescored with 75 percent identical rating. Some evidence of
differential leniency emerged.

Internal Consistency: No information provided. 
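
The "equal or adjacent" agreement figures reported under Scoring can be illustrated with a short sketch. It assumes that "adjacent" means two ratings differing by exactly one scale point; the function name and the sample ratings are hypothetical and are not drawn from the Massachusetts data.

```python
def agreement_rates(rater_a, rater_b):
    """Return (exact, exact_or_adjacent) agreement proportions.

    Ratings are assumed to be integers on the same scale, e.g. the
    four-interval scale from inadequate (1) to superior (4).
    """
    rater_a, rater_b = list(rater_a), list(rater_b)
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must supply the same nonzero number of ratings")
    diffs = [abs(a - b) for a, b in zip(rater_a, rater_b)]
    n = len(diffs)
    exact = sum(d == 0 for d in diffs) / n
    within_one = sum(d <= 1 for d in diffs) / n
    return exact, within_one
```

With hypothetical ratings of [1, 2, 3, 4] and [1, 3, 3, 2], half of the pairs are identical and three quarters are identical or adjacent.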



EVALUATIVE REACTIONS

Practicality: Initial teacher screening considerably reduces the burden of more focused
assessment. The use of a single rater/administrator in second phase testing also limits
personnel demands.

Validity For Specific Purposes and Populations: The test samples varied communication
contexts. It includes naturalistic observation. It uses broad evaluation criteria that
encompass a broad range of skills. The question of racial/ethnic bias is undergoing additional
inquiry.

Reliability: Teacher expectations are likely to play a role in initial screening, perhaps
contaminating ratings with general ability. The use of only one rater in second phase also
seems problematic. This issue is undergoing additional inquiry. Equivalent forms of the
communication tasks are a major advantage.

Overall Adequacy: There is no guarantee that teacher ratings do, in fact, reflect only
communication skills. Rubrics for criteria identify functional skills, but criteria names
appear formal and absolute. The one-on-one communication tasks are fairly artificial.

MATERIALS REVIEWED: Massachusetts Department of Education, Bureau of Research
and Assessment. Massachusetts Assessment of Basic Skills 1979-80 Development Report:
Listening, Speaking. Boston: author, 1980.

OTHER REFERENCES: Massachusetts Department of Education. Assessment of Speaking
Skills State Test (Secondary Level). Boston: author, 1979.



REVIEWER: Don Rubin 



23. Measure of Communication Competence



AGE RANGE: Ages 1½ to 4.



SKILLS TESTED: i Speaking 
— Listening 
z_ Interaction 

Visual Encoding ' 

- Visual Decoding ' 

Subskill Or Attitude 



COST: Not commercially available. 

TIME REQUIRED FOR ADMINISTRATION: Approximately fifteen minutes.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: Children are given a
series of fourteen probes which form an informal interview. Responses are judged as
appropriate or inappropriate. The probes and scoring guides are provided. There are two probes
for each of seven modes or functions of communication. The modes are (1) contactive,
(2) conversative, (3) descriptive, (4) directive, (5) explanative, (6) narrative, and (7) persuasive.
These modes represent a developmental continuum of communication competencies.

NORM/CRITERION DATA: No information provided. 

VALIDITY 
Predictive: No information provided.
Concurrent: No information provided.
Content and Item Selection: No information provided.

Construct and Other Empirical Studies: Modes were validated by F. Williams and R.
Naremore, "On the Functional Analysis of Social Class Differences in Codes of Speech,"
Speech Monographs, 1969, 36, 77-102.
RELIABILITY

Alternate Forms: Not applicable.

Test-Retest: No information provided.

Scoring: Inter-rater agreement was 78 percent and 81 percent.

Internal Consistency: No information provided.

EVALUATIVE REACTIONS

Practicality: This approach is feasible but requires one-on-one assessment and scorers.

Validity For Specific Purposes and Populations: Evidence indicates that test validity is
good.

Reliability: Evidence indicates that test reliability is good.

Overall Adequacy: This type of assessment measures basic, functional communication
competencies. It could be adapted for older children. However, it would be harder to
identify a continuum of higher level competencies.

MATERIALS REVIEWED: Riccillo, S. C. "Children's Speech and Communicative Competence."
Unpublished doctoral dissertation, University of Denver, 1974.



OTHER REFERENCES: None. 



REVIEWER: Nancy Mead







24. Metropolitan Achievement Tests; Listening
Comprehension



AGE RANGE: Kindergarten-grade 4. 



SKILLS TESTED: _ Speaking
X Listening
_ Interaction
_ Visual Encoding
_ Visual Decoding
_ Subskill or Attitude



COST: $16.25 per thirty-five copies. 



TIME REQUIRED FOR ADMINISTRATION: Approximately twenty to twenty-five minutes.



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: There is a listening
comprehension component in the language subtest in four levels of the test battery: Primer,
Primary 1, Primary 2, and Elementary. Each subtest has two forms. There are twenty-one
to thirty listening items in the language subtest. The student is asked to pick the picture, out
of four choices, that answers a question about a sentence or several sentences that the
administrator then reads aloud.



NORM/CRITERION DATA: The test was normed with a stratified national sample of
students. Over 550,000 students participated. Data were collected in the fall and spring.



VALIDITY 

Predictive: No information provided.
Concurrent: No information provided.

Content and Item Selection: Test items are based on textbooks and curriculum objectives
that are commonly used. Items were reviewed for bias by experts. Standard item analyses
were conducted.

Construct and Other Empirical Studies: No information provided.

RELIABILITY

Alternate Forms: No information provided.
Test-Retest: No information provided.
Scoring: Not applicable.

Internal Consistency: Average inter-item correlations are available for the language subtest 
as a whole, which includes the listening comprehension component. For example, the 
reliability of the language subtest at grade 6 was .92. 

EVALUATIVE REACTIONS 

Practicality: The test is group administered. The test requires that the administrator dictate
the items, but the directions are very clear.

Validity For Specific Purposes and Populations: Although little validity-related evidence
is provided, the test covers standard areas of the school curriculum.

Reliability: The overall battery and major subtests are highly reliable. No evidence is given
regarding the listening comprehension component.

Overall Adequacy: The test covers the basic elements of listening comprehension. However,
it does not represent a breadth of listening material.

MATERIALS REVIEWED: Prescott, G. A.; Balow, I. H.; Hogan, T. P.; and Farr, R. C.
Metropolitan Achievement Tests. New York: The Psychological Corporation, 1978.

OTHER REFERENCES: None.
REVIEWER: Nancy Mead






25. Michigan Educational Assessment Program; 
Listening Test 



AGE RANGE: Grades 4, 7, and 10.

SKILLS TESTED: _ Speaking
X Listening
_ Interaction
_ Visual Encoding
_ Visual Decoding
_ Subskill or Attitude



COST: Not specified.

TIME REQUIRED FOR ADMINISTRATION: Approximately forty-five minutes.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test measures several
objectives of critical listening, including main idea, summary, purpose, recall of details,
cause and effect, inference about the speaker or people described by the speaker, fact or
opinion, and story line or sequence. Several stories are read aloud from a tape recording. After
each story several multiple-choice questions and response options are read aloud. The student
has the questions and response options in a test booklet and marks the best answer on an
answer sheet.

NORM/CRITERION DATA: No information provided. 

VALIDITY 
Predictive: No information provided.

Concurrent: No information provided.
Content and Item Selection: No information provided.
Construct and Other Empirical Studies: No information provided.

RELIABILITY
Alternate Forms: Not applicable.
Test-Retest: No information provided.

Scoring: Not applicable.

Internal Consistency: No information provided.



EVALUATIVE REACTIONS
Practicality: No manual is included. Tests are easy to administer.
Validity For Specific Purposes and Populations: Inadequate information to judge.
Reliability: Inadequate information to judge.
Overall Adequacy: The test measures a variety of listening skills.

MATERIALS REVIEWED: Michigan Department of Education. Listening Test, Grades 4,
7, 10, booklets and scripts. Lansing: author, 1978-1979.

OTHER REFERENCES: Michigan Department of Education. Minimal Performance
Objectives. Lansing: author, 1978-1979.

REVIEWER: Nancy Mead
26. National Assessment of Educational Progress
Pilot Test of Speaking and Listening



AGE RANGE: Age 17.

SKILLS TESTED: X Speaking
X Listening
_ Interaction
_ Visual Encoding
_ Visual Decoding
X Subskill or Attitude: communication

COST: Not commercially available.



TIME REQUIRED FOR ADMINISTRATION: Fifty minutes per booklet, five booklets.
None of the booklets encompasses all of the objectives.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test includes indirect
measures of speaking skills in the areas of informing, controlling, and expressing
feelings. The test uses a multiple-choice format to query appropriate responses to brief
scenarios. Multiple-choice questions about ritualizing cover both speaking and listening skills.
Tape recorded stimuli, instructions, and multiple-choice questions are used to measure
listening/recognizing in the informing, controlling, and expressing feelings functions. Attitudes
toward communication are measured by Likert scales.

NORM/CRITERION DATA: Test characteristics were analyzed using 693 students
representing a variety of geographic regions, types of communities, and ethnic and
socioeconomic backgrounds.

VALIDITY
Predictive: No information provided.
Concurrent: No information provided.

Content and Item Selection: Item generating matrix and objectives were based on a theory
of functional communication competence. Objectives and items were developed by panels
of experts and were reviewed by a minority review panel. Method of administration avoided
contamination with reading ability and prior knowledge of subject matter.

Construct and Other Empirical Studies: No significant correlations between listening
and speaking subtests were reported. Inconsistent correlations between subtests within
speaking and listening were reported. The relationships within speaking were stronger than
within listening. Inconsistent correlations between communication knowledge and attitudes
were reported.



RELIABILITY 

Alternate Forms: Not applicable.
Test-Retest: No information provided.

Scoring: Scoring is objective; however, item analyses revealed a lack of consistent response
patterns for some items and some distractors.

Internal Consistency: Average inter-item correlations were .78 for informing/speaking
items, .72 for ritualizing items, .78 for communication attitudes, .66 for controlling/
speaking, .6} for informing/listening, .59 for controlling/listening. Correlations were very
low for speaking and listening questions in the area of expressing feelings. There were
no measures of overall test reliability.
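
The review reports figures it labels average inter-item correlations and notes that no
overall reliability was computed. As background only, the sketch below shows one standard
way a mean inter-item correlation relates to a scale-level coefficient (standardized alpha,
via the Spearman-Brown relationship). The reviewed materials do not say this formula was
used, and the item count and correlation in the example are hypothetical.

```python
def standardized_alpha(mean_inter_item_r, n_items):
    """Standardized alpha: k * r / (1 + (k - 1) * r) for k items with mean correlation r."""
    if n_items < 2:
        raise ValueError("a scale needs at least two items")
    return n_items * mean_inter_item_r / (1 + (n_items - 1) * mean_inter_item_r)


# Hypothetical scale: 10 items whose inter-item correlations average .25.
alpha = standardized_alpha(0.25, 10)
print(round(alpha, 3))
```

The formula shows why a longer subscale can reach an acceptable coefficient even when
individual items correlate only modestly with one another.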

EVALUATIVE REACTIONS

Practicality: The entire test battery would require lengthy administration. Use of recorded
instructions and multiple-choice response format eases administration and scoring.

Validity For Specific Purposes and Populations: There is no evidence that the indirect
test of speaking knowledge predicts speaking performance. Factor analysis of communication
attitudes is difficult to interpret. Some results suggest depressed, though not necessarily
biased, performance for minority students. The content validity is high.

Reliability: The reliability varied considerably among content areas. Results of item
analysis may be used to improve both reliability and validity.

Overall Adequacy: The test stems from a strong conceptual framework. Indirect testing
of communication skills may be an untenable technique, however, since it is difficult to
adequately define communication contexts and associated contingencies. Listening subtests
show promise since they utilize oral language and are constructed to be uncontaminated
by extraneous factors like reading ability and prior subject matter knowledge.

MATERIALS REVIEWED: Mead, N. A. "The Development of an Instrument for Assessing
Functional Communication Competence of Seventeen-Year-Olds." Unpublished
dissertation, University of Denver, 1977.

OTHER REFERENCES: None.

REVIEWER: Don Rubin 
27. New York State Regents Comprehensive
Examination in English; Listening Section



AGE RANGE: Grade 12.

SKILLS TESTED: _ Speaking
X Listening
_ Interaction
_ Visual Encoding
_ Visual Decoding
_ Subskill or Attitude

COST: Not commercially available.

TIME REQUIRED FOR ADMINISTRATION: Approximately fifteen minutes.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: A single passage is
read aloud twice. The test includes ten multiple-choice items emphasizing recall, purpose,
and some inference.

NORM/CRITERION DATA: No information provided.



VALIDITY

Predictive: No information provided.

Concurrent: No information provided.

Content and Item Selection: No information provided.

Construct and Other Empirical Studies: No information provided.

RELIABILITY

Alternate Forms: Not applicable.
Test-Retest: No information provided.
Scoring: Not applicable.

Internal Consistency: No information provided.



EVALUATIVE REACTIONS 

Practicality: The test is easily administered and scored.

Validity For Specific Purposes and Populations: The test samples a restricted range of
communication contexts. The responses are confounded with reading ability.

Reliability: A single, short passage with a ten-item test is not likely to yield reliable scores.
Moreover, inconsistencies in administration due to reading of passage contribute to
measurement error.

Overall Adequacy: There is little basis for interpreting scores as indices of listening
achievement. The test may, however, have some value as a component of a comprehensive
English test.

MATERIALS REVIEWED: University of the State of New York. Regents High School
Examination: Comprehensive Examination in English. Albany, NY: author, 1980.



OTHER REFERENCES: None.



REVIEWER: Don Rubin 
28. New York Statewide Achievement
Examination in English



AGE RANGE: Grade 12.

SKILLS TESTED: X Speaking
X Listening
_ Interaction
_ Visual Encoding
_ Visual Decoding
_ Subskill or Attitude

COST: Not commercially available.



TIME REQUIRED FOR ADMINISTRATION: Speaking: four minutes per student;
listening: fifteen minutes.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: For the speaking test,
students present a three minute monologue on one of three supplied topics. Two minutes are
allowed for preparation. Assessment takes place in class. A single rater assigns a general
impression mark of 0 (nonperformance) to 10 (strong). This mark is based on (1) content,
including statement of topic, focus, and support, (2) organization, including coherence and
clarity, and (3) voice and articulation, including volume but excluding regionalism and accent.
For the listening test, a passage is read aloud. The test includes ten multiple-choice items
emphasizing recall and inference.

NORM/CRITERION DATA: The test is intended for non-academically tracked students.



VALIDITY 

Predictive: No information provided.

Concurrent: No information provided.

Content and Item Selection: No information provided.

Construct and Other Empirical Studies: No information provided.






RELIABILITY 



Alternate Forms: Not applicable.
Test-Retest: No information provided.
Scoring: Not applicable.
Internal Consistency: No information provided.



EVALUATIVE REACTIONS 

Practicality: The test is relatively easy to administer and score. It requires little time.

Validity For Specific Purposes and Populations: The test measures a limited range of
speaking skills. The speaking task is highly artificial, lacking context for communication.
The criteria seek to avoid bias against nonstandard dialect speakers, but the task may be
inherently biased.

Reliability: A single rater, presumably a classroom teacher, is likely to introduce rating
error. However, a scoring guide with sample graded transcripts is supplied. Some provision
is made for consistency of administration. Topics may affect scores.

Overall Adequacy: The test is not functionally oriented because it lacks context for
communication. Speaking task and criteria tap a limited range of skills.

MATERIALS REVIEWED: University of the State of New York. Statewide Achievement
Examination in English. Albany, NY: author, 1980.

OTHER REFERENCES: None.



REVIEWER: Don Rubin 
29. Oliphant Tests; Auditory Synthesizing Test
and Auditory Discrimination Memory Test



SKILLS TESTED: _ Speaking
_ Listening
_ Interaction
_ Visual Encoding
_ Visual Decoding
X Subskill or Attitude: auditory memory



COST: Not specified. 



TIME REQUIRED FOR ADMINISTRATION: Not specified.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: In the auditory
synthesizing test, children are presented with individual sounds which they must hold in memory
to form words composed of those sounds. In the discrimination test, children hear two words
that sound alike. One of the two words is spoken again by the test administrator. Children
must select which word was spoken twice.



NORM/CRITERION DATA: No information provided.

VALIDITY

Predictive: No information provided.

Concurrent: No information provided.

Content and Item Selection: No information provided.

Construct and Other Empirical Studies: No information provided.

RELIABILITY
Alternate Forms: Not applicable.
Test-Retest: No information provided.
Scoring: Not applicable.

Internal Consistency: No information provided.



EVALUATIVE REACTIONS 

Practicality: The test is useful for children suffering from auditory problems. 
Validity For Specific Purposes and Populations: Inadequate information to judge.

Reliability: Inadequate information to judge.

Overall Adequacy: The test is quite narrow in its focus. The manual does not provide
sufficient instruction for the test administrator.

MATERIALS REVIEWED: Oliphant Tests: Auditory Synthesizing Test and Auditory
Discrimination Memory Test. Cambridge, MA: Educators Publishing Service.



OTHER REFERENCES: None. 



REVIEWER: John Daly 
30. Oral Language Evaluation

AGE RANGE: Not specified, but appears appropriate for elementary level.



SKILLS TESTED: Speaking
X 1 r 

Visual Encoding
Visual Decoding

X Subskill or Attitude: assesses child's primary language, identifies
children who need training in English



COST: Not specified. 



TIME REQUIRED FOR ADMINISTRATION: Part 1, not specified; part 2, two minutes
per child; part 3, ten minutes per child.



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test is designed to
identify, assess, diagnose, and prescribe the oral language ability of English and Spanish
speaking children. Part 1 identifies children who may need training in a second language.
This is assessed by teacher observation and information from school records. The child is
judged to need English as a second language (ESL) training if the primary language in the
home is not English, if the student most often speaks a language other than English, or if
the child's first acquired language is not English. Part 2 assesses the child's primary language
and determines if additional testing in oral language ability is necessary. The test administrator
presents four pictures and encourages the child to discuss them. The test administrator uses
a "Six Level Language Continuum" to determine if further testing is necessary. Examples
of responses at various levels are provided. The same procedure is followed in Part 3 except
the child's discussion of the pictures is tape recorded. Transcripts of the responses are analyzed
using the "Six Level Language Continuum." Part 4 provides instructional activities for
children at each level on the continuum.
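
The Part 1 screening decision described above is an "any of three conditions" rule. The
sketch below restates that rule in code for clarity. The function and parameter names are
hypothetical, and the real instrument relies on teacher observation and school records
rather than simple text labels.

```python
def needs_esl_training(home_language, most_spoken_language, first_language):
    """Apply the Part 1 rule: flag the child if any of the three languages is not English."""
    languages = (home_language, most_spoken_language, first_language)
    # Compare case-insensitively and ignore stray whitespace in the recorded labels.
    return any(lang.strip().lower() != "english" for lang in languages)


# Hypothetical records: (primary home language, language most often spoken, first language).
flag_a = needs_esl_training("English", "English", "English")
flag_b = needs_esl_training("Spanish", "English", "English")
print(flag_a, flag_b)
```

Because a single non-English entry triggers the flag, the rule is deliberately inclusive:
it identifies candidates for the Part 2 and Part 3 assessments rather than making a final
placement decision.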



NORM/CRITERION DATA: No information provided.

VALIDITY

Predictive: No information provided.

Concurrent: No information provided.

Content and Item Selection: No information provided.

Construct and Other Empirical Studies: No information provided.

RELIABILITY

Alternate Forms: Not applicable.

Test-Retest: No information provided.

Scoring: No information provided.

Internal Consistency: No information provided.

EVALUATIVE REACTIONS

Practicality: The test is administered by teacher; no training is necessary. Materials are
provided in the teacher's manual.

Validity For Specific Purposes and Populations: Although no validity information is
provided, the test may give a reasonable estimate of the child's "primary" language skills.

Reliability: Inadequate information to judge.

Overall Adequacy: These activities would be helpful to teachers working with English as
a second language children, but they do not constitute a test of communication skill. There
is a serious lack of rigorous evaluation.



MATERIALS REVIEWED: Silvaroli, N. J.; Skinner, J. T.; and Maynes, Jr.
Oral Language Evaluation. St. Paul, MN: EMC Corporation, 1977.



OTHER REFERENCES: None. 



REVIEWER: Janice Patterson



31. Profile of Nonverbal Sensitivity

AGE RANGE: Secondary.

SKILLS TESTED: _ Speaking
_ Listening
_ Interaction
_ Visual Encoding
X Visual Decoding
_ Subskill or Attitude

COST: Videotape and 16mm film not commercially available.

TIME REQUIRED FOR ADMINISTRATION: Approximately forty-five minutes for full
220 item test; short versions are available.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The full test includes
items composed of eleven channels. The channels include facial expression, body,
face/body, electronically altered voice (free of language content), and various combinations
thereof. Stimuli are presented on film or videotape with pauses for identifying items and
responding. Answer sheet is a two-option multiple-choice format. Response options are
behavioral descriptions such as "asking forgiveness" or "talking about one's divorce."

NORM/CRITERION DATA: The main standardization group was composed of 491 high
school students of average intelligence from three geographical regions. Other norming studies
were based on various adult, international, and impaired populations.

VALIDITY
Predictive: No information provided. 



Concurrent: For high school students, test scores were slightly correlated with intelligence
and SAT scores. Moderate correlations were found between test scores and various other
tests of nonverbal decoding including Communication of Affect Receiving Ability Test and
Social Interpretations Test. Zero, low, or moderate correlations emerged between test
scores and several measures of psychological traits such as dogmatism and teacher aptitude.
Test scores correlated strongly with other ratings of social competence and sensitivity.

Content and Item Selection: No information provided.

Construct and Other Empirical Studies: Test stimuli accounted for a wide range of
nonverbal information including movement, full body posture, and voice. A large number
of empirical studies demonstrated that the instrument can discriminate among various
occupational and social categories of subjects. Training improved measured nonverbal
sensitivity. Other studies found no ethnic bias, but some cross-cultural bias favoring
Americans as opposed to other nationalities.

RELIABILITY 

Alternate Forms: Various short forms of the instrument have been developed. Accuracy
was significantly greater on the full film version than on a still photograph version. Low
or moderate correlations were found for other shortened versions and the full test.

Test-Retest: For 293 subjects, both adult and high school, with second exposure from ten
days to eight weeks after first, pooled reliability was .69.

Scoring: Not applicable.

Internal Consistency: Average inter-item correlation was .86.

EVALUATIVE REACTIONS 

Practicality: The test is well packaged for ease of administration and scoring. It requires
a full class period.

Validity For Specific Purposes and Populations: Stimuli cover a wide range of nonverbal
signals. The major threats to validity pertain to the test response mode. Students must be
able to read behavioral descriptions. Moreover, the meaning of some of the descriptors
may be variable or unfamiliar for many students. With only two options for each item,
item difficulty may be too low for adequate discrimination in many cases.
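
The concern about two-option items can be made concrete with a chance baseline: a student
who guesses blindly on every item is still expected to answer half of them correctly. The
sketch below computes the expected score and its standard deviation under pure guessing
for the 220-item full test. It assumes independent, equally likely guesses, and the
four-option comparison is a hypothetical contrast rather than a feature of the instrument.

```python
import math


def chance_baseline(n_items, n_options):
    """Mean and standard deviation of the number correct under blind guessing."""
    p = 1 / n_options  # probability of guessing any one item correctly
    mean = n_items * p
    sd = math.sqrt(n_items * p * (1 - p))  # binomial standard deviation
    return mean, sd


mean_two, sd_two = chance_baseline(220, 2)    # the two-option format described above
mean_four, sd_four = chance_baseline(220, 4)  # hypothetical four-option contrast
print(f"two options: {mean_two:.0f} +/- {sd_two:.2f}")
print(f"four options: {mean_four:.0f} +/- {sd_four:.2f}")
```

With half of the score range attainable by guessing, less of the scale is left to separate
students who differ in actual decoding skill.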

Reliability: Use of short versions is questionable, but reliability is otherwise quite
respectable for a communication decoding task involving the recognition of affect.

Overall Adequacy: Analysis of children's understanding of the response terms is necessary
to interpret test results. The test stimulus appears to have high ecological validity for the
range of nonverbal sensitivity measured.

MATERIALS REVIEWED: Rosenthal, R.; Hall, J. A.; DiMatteo, M. R.; Rogers, P. L.;
and Archer, D. Sensitivity to Nonverbal Communication: The PONS Test. Baltimore: Johns
Hopkins University Press, 1979.



REVIEWER: Don Rubin 
32. PRI Reading Systems, Oral Language Skill
Clusters

AGE RANGE: Kindergarten-grade 3.

SKILLS TESTED: _ Speaking
X Listening
_ Interaction
_ Visual Encoding
_ Visual Decoding
_ Subskill or Attitude

COST: Not specified. 

TIME REQUIRED FOR ADMINISTRATION: Fifty to seventy minutes 

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test is composed
of subscales that assess (1) sound segmentation, (2) vocabulary, (3) syntax, (4) literal
meaning, and (5) inferred meaning. The test is basically a phonetic discrimination and listening
test. There are two levels of the test. Level A covers kindergarten and grade 1; Level B
covers grades 2 and 3. Multiple-choice questions are orally presented by the teacher.

NORM/CRITERION DATA: No information provided. 



VALIDITY 

Predictive: No information provided.

Concurrent: No information provided.

Content and Item Selection: No information provided.

Construct and Other Empirical Studies: No information provided.

RELIABILITY
Alternate Forms: Not applicable.
Test-Retest: No information provided.
Scoring: Not applicable.

Internal Consistency: No information provided.

EVALUATIVE REACTIONS 

Practicality: The test seems suited to the classroom teacher. The success of the test depends
on appropriate administration of the test by the teacher.

Validity For Specific Purposes and Populations: Inadequate information to judge.

Reliability: Inadequate information to judge.

Overall Adequacy: The test is very limited in what it assesses. Results could be confounded
with intelligence, teacher delivery, and other constructs.

MATERIALS REVIEWED: PRI Reading Systems. New York: McGraw Hill, 1980.

OTHER REFERENCES: None.

REVIEWER: John Daly 
33. SRA Achievement Series

AGE RANGE: Kindergarten-grade 3.



SKILLS TESTED: _ Speaking
Listening
Interaction
Visual Encoding
Visual Decoding
Subskill or Attitude: auditory discrimination



COST: Not specified. 



TIME REQUIRED FOR ADMINISTRATION: Auditory recognition, twenty or twenty- 
five minutes; listening comprehension, twenty-five minutes. 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: There are three levels
of the test: A (kindergarten-grade 1), B (grades 1-2), and C (grades 2-3). Each level has
two forms. For the auditory discrimination test (Levels A-B), the test administrator reads
two words and the student must answer whether the words are alike or different in one of
the following ways: (1) beginning sounds, (2) vowels, (3) ending sounds, and (4) beginning
and ending sounds.

In the listening comprehension test (Levels A-C) the student is required to identify the correct
illustration for a word or situation read aloud by the test administrator. Skills include:
(1) identifying a picture specified by oral directions, (2) identifying a picture of a detail in
a sentence or story given orally, (3) identifying a picture of the main idea of a sentence or
story given orally, (4) identifying a picture of a relationship among events in a story given
orally (such as sequence or cause), and (5) identifying a picture of a conclusion based on
material given orally.

NORM/CRITERION DATA: The tests were normed with at least 3,000 students per grade
per form in the first standardizing. The tests were standardized a second time with a smaller
group.

VALIDITY 
Predictive: No information provided. 
Concurrent: No information provided.

Content and Item Selection: Extensive content validity and item selection procedures
were implemented. Items were reviewed for bias. Statistical tests of bias were conducted.

Construct and Other Empirical Studies: No information provided.

RELIABILITY

Alternate Forms: No information provided.
Test-Retest: No information provided.
Scoring: Not applicable.

Internal Consistency: Based on data from Form 1, average inter-item correlations for the
auditory discrimination test ranged from .79 to .89. Average inter-item correlations for
the listening comprehension test ranged from .55 to .80.

EVALUATIVE REACTIONS
Practicality: The tests provide complete, easy to follow directions.
Validity For Specific Purposes and Populations: Evidence indicates the test validity is
good.

Reliability: Evidence indicates the reliability of the auditory discrimination test is good;
the reliability of the listening comprehension test is fair.

Overall Adequacy: These tests are designed to measure reading readiness. The tests are
not as good for measures of communication.

MATERIALS REVIEWED: SRA Achievement Series. Chicago, IL: Science Research
Associates, 1979.



OTHER REFERENCES: None. 



REVIEWER: Nancy Mead 
34. Sequential Tests of Educational Progress;
Listening

AGE RANGE: Grades 3-12.

SKILLS TESTED: _ Speaking
X Listening
_ Interaction
_ Visual Encoding
_ Visual Decoding
_ Subskill or Attitude

COST: $12.

TIME REQUIRED FOR ADMINISTRATION: Twenty minutes. 

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test has six levels,
one for each of six age groups. Each level has two forms (X and Y). The teacher reads a
short passage to students; students then answer multiple-choice questions about the passage.
Direction following is also included in the test. Teachers read instructions to students who
work on the dictated problem using a worksheet.

NORM/CRITERION DATA: Specific studies were not summarized. Norm, classification, 
and percentile values are in the process of being derived. 

VALIDITY 
Predictive: No information provided. 
Concurrent: No information provided. 
Content and Item Selection: No information provided.
Construct and Other Empirical Studies: No information provided.

RELIABILITY
Alternate Forms: No information provided.
Test-Retest: No information provided.
Scoring: Not applicable.

Internal Consistency: No information provided.

EVALUATIVE REACTIONS 
Practicality: The test appears to have very good potential. 

Validity For Specific Purposes and Populations: The test meets the specific goals stated 
in the justification. 

Reliability: Inadequate information to judge. 

Overall Adequacy: Inadequate information to judge. Studies are likely in progress. This
test may become a standard listening test.

MATERIALS REVIEWED: Educational Testing Service. Sequential Test of Educational
Progress. Reading, MA: Addison-Wesley, 1979.

OTHER REFERENCES: None.

REVIEWER: John Daly 
35. Stanford Achievement Tests: Listening Comprehension

AGE RANGE: Grades 1-6. 

SKILLS TESTED: _ Speaking
Listening
Interaction
Visual Encoding
Visual Decoding
Subskill or Attitude



COST: $20.50 per thirty-five booklets.

TIME REQUIRED FOR ADMINISTRATION: Approximately twenty-five to thirty-five
minutes.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: A listening comprehension
component is included in five levels of the test battery: Primary I, Primary II, Primary
III, Intermediate I, and Intermediate II. Each test has two forms (A and B). There are
twenty-six items in the lowest level and fifty items in all other levels. The student hears a wide
variety of passages dictated by the administrator and picks the picture or answer, out of four
choices, that best answers a question about the passage. The questions measure central focus,
specific meanings, implied meanings, perceptions of concepts and relations, and identification
of inferences.

NORM/CRITERION DATA: The test was normed on a national sample of students that
represented various locations and types of communities. Over 225,060 students were
involved. The study included fall and spring administrations.

VALIDITY
Predictive: No information provided.

Concurrent: Correlations between the listening test and the other tests in the battery and
with the Otis-Lennon Mental Ability Test were generally high, ranging from the .50s to the .80s,
indicating that all tests are operating in the same general domain. Correlations were about
as high with math as with reading.
Content and Item Selection: The test was based on commonly used textbooks and
curriculum objectives. Items were reviewed for bias by experts. Standard item analyses were
performed.

Construct and Other Empirical Studies: No information provided. 



RELIABILITY 
Alternate Forms: No information provided. 
Test-Retest: No information provided.
Scoring: Not applicable. 

Internal Consistency: Split-half reliabilities ranged from .86 to .88 and average
inter-item correlations ranged from .85 to .89 for three grades about which the reviewer had
information.
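
For readers unfamiliar with split-half reliability, the following is a minimal sketch of one common way to compute it. It assumes an odd/even item split and the Spearman-Brown step-up to full test length; the review does not say which split or correction the test publisher used, and the function names and sample scores are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary to compute a correlation")
    return sxy / (sxx * syy) ** 0.5

def split_half_reliability(item_scores):
    """Odd/even split-half reliability with the Spearman-Brown correction.

    item_scores holds one row per student; each row lists that student's
    item scores (1 = correct, 0 = incorrect) in test order.
    The odd/even split is an assumption made for illustration.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson(odd_half, even_half)
    # Spearman-Brown: estimate reliability of the full-length test
    return 2 * r_half / (1 + r_half)

# Hypothetical responses from five students on a six-item test
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
]
print(round(split_half_reliability(scores), 2))
```

A real analysis would use far more students and items than this toy sample; the sketch only shows the arithmetic behind a coefficient of the kind reported above.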



EVALUATIVE REACTIONS 

Practicality: The test is group administered. The listening comprehension items must be
dictated by the administrator but the instructions are very clear.

Validity For Specific Purposes and Populations: Although not much evidence of validity
is provided, the test covers the areas commonly included in curricula.

Reliability: The evidence indicates the reliability of the listening component is very good.

Overall Adequacy: The test attempts to cover a variety of listening passages and types of
questions. Some questions, however, appear to test thinking skills or vocabulary knowledge
rather than comprehension.



MATERIALS REVIEWED: Madden, R.; Gardner, E. F.; Rudman, H. P.; Karlsen, B.;
and Merwin, J. G. Stanford Achievement Test. New York: Harcourt Brace Jovanovich, Inc.,
1973.



OTHER REFERENCES: None. 



REVIEWER: Nancy Mead 
36. Stanford Early School Achievement Test

AGE RANGE: Kindergarten-grade 1.



SKILLS TESTED: _ Speaking
X Listening
_ Interaction
_ Visual Encoding
Visual Decoding
Subskill or Attitude



COST: Not specified. 

TIME REQUIRED FOR ADMINISTRATION: Approximately ninety minutes. 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test is designed to
assess children's cognitive abilities. The test consists of four parts: (1) the environment,
(2) mathematics, (3) letters, and (4) sounds and aural comprehension. This is a
group-administered test in which children write in answer booklets to indicate their responses. Subscores
are available in each of the four parts.

NORM/CRITERION DATA: Norms were determined from responses of children from
twenty-five states. The final norm sample consisted of 8,310 kindergarten and 11,106 first
graders. Census, "size of city" data, and intelligence scores were used in selecting these
children.

VALIDITY
Predictive: No information provided.

Concurrent: The test correlated .74 with the Otis-Lennon Mental Ability Test for 11,106
first graders.

Content and Item Selection: Original questions were given to 3,!bd first grade children
in ten school districts. The best items were selected from those forms.

Construct and Other Empirical Studies: No information provided.

RELIABILITY
Alternate Forms: Not applicable.
Scoring: Not applicable.

Internal Consistency: The split-half reliability coefficients ranged from .76 to .85 for
kindergarten and .77 to .89 for first grade.

EVALUATIVE REACTIONS 

Practicality: Children are tested in groups of seven to fifteen. The test may be impractical
for large scale testing. No special training is necessary; the test may be administered by
the teacher.

Validity For Specific Purposes and Populations: Inadequate information to judge.
Reliability: Inadequate information to judge.

Overall Adequacy: This is not a test of speech communication but rather a test of cognitive
abilities. The instruction manual includes classroom activities.

MATERIALS REVIEWED: Madden, R., and Gardner, E. F. Stanford Early School
Achievement Test. New York: Harcourt Brace Jovanovich, Inc., 1969.

OTHER REFERENCES: None. 

REVIEWER: Janice Patterson 
37. Situational Language Tasks



AGE RANGE: Grades 1-3. 



SKILLS TESTED: ^ Speaking 
^ Listening 
X Interaction
z-^ Visual Encoding 

Visual Decoding

— Subskill or Attitude 



COST: User produces materials and scores. 

TIME REQUIRED FOR ADMINISTRATION: Fifteen minutes per session; three sessions. 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test is divided into
three parts. The first section is designed to assess classroom teacher-child and child-child
interaction. The teacher and the whole class discuss an assortment of common objects. The
teacher is instructed to use these materials to elicit conversation from the children. The
session is tape recorded and lasts fifteen minutes. In the second section, conducted on a
different day, three children and a teacher discuss cartoon-like pictures. The teacher shows
one picture at a time and asks the children a series of structured questions such as, "What
is happening in this picture?" In the second phase of section 2, the teacher shows two cards
and asks, "What picture comes first?" In the final five-minute phase of section 2, the teacher
presents three cards and asks, "What story does this picture tell?" All conversations are
tape recorded. The third section of the test occurs immediately following section 2. The
teacher tells the small group of children that she has work to do but that they may stay and
discuss the cards. The tape recorder continues to run and the children's resulting conversations
are recorded for analysis. Transcripts of the children's speech are analyzed in four major
areas: (1) type-token ratio (ratio of the number of different words to the total number of
words), (2) verb tense diversity, (3) vocabulary diversity, and (4) average number of words
per child.
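
Two of the four transcript measures named above are defined precisely enough to sketch in code. The following is an illustrative computation of the type-token ratio and the average number of words per child; the tokenizing rule, the function names, and the sample transcript are assumptions for demonstration and are not taken from the test materials.

```python
import re

def tokenize(utterance):
    """Lowercase an utterance and split it into word tokens."""
    return re.findall(r"[a-z']+", utterance.lower())

def type_token_ratio(utterances):
    """Number of different words (types) divided by total words (tokens)."""
    tokens = [word for u in utterances for word in tokenize(u)]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

def average_words_per_child(transcript):
    """Mean word count per child.

    transcript maps a child label to that child's list of utterances.
    """
    if not transcript:
        return 0.0
    totals = [sum(len(tokenize(u)) for u in utts) for utts in transcript.values()]
    return sum(totals) / len(totals)

# Hypothetical transcript from one small-group session
transcript = {
    "child_a": ["the dog runs", "the dog jumps"],
    "child_b": ["a cat sleeps"],
}
all_utterances = [u for utts in transcript.values() for u in utts]
print(type_token_ratio(all_utterances))       # 7 types over 9 tokens
print(average_words_per_child(transcript))    # (6 + 3) / 2
```

Even in this small form, the need to transcribe and tokenize every session shows why the reviewer considers the scoring burdensome for classroom teachers.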

NORM/CRITERION DATA: No information provided. 

VALIDITY 

Predictive: No information provided.
Concurrent: No information provided.
Content and Item Selection: The tasks were selected to reflect children's natural language
in the classroom.

Construct and Other Empirical Studies: No information provided. 



RELIABILITY 
Alternate Forms: Not applicable.
Test-Retest: No information provided.

Scoring: The consensus method of reliability was used for the tape transcriptions. Five
researchers coded the transcripts; the reliability ranged from 89 percent to 100 percent.

Internal Consistency: No information provided. 
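
The review does not define how the 89 to 100 percent scoring figures were calculated. As a hedged illustration only, the sketch below assumes simple percent agreement between each pair of coders over the same transcript units; the coder labels, category codes, and function names are hypothetical.

```python
from itertools import combinations

def percent_agreement(codes_a, codes_b):
    """Percentage of units to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must rate the same units")
    if not codes_a:
        raise ValueError("no units to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)

def pairwise_agreement(codings):
    """Percent agreement for every pair of coders.

    codings maps a coder label to that coder's list of codes,
    one code per transcript unit, in the same order for all coders.
    """
    return {
        (a, b): percent_agreement(codings[a], codings[b])
        for a, b in combinations(sorted(codings), 2)
    }

# Hypothetical verb-tense codes from three coders on four utterances
codings = {
    "coder_1": ["past", "present", "present", "future"],
    "coder_2": ["past", "present", "past", "future"],
    "coder_3": ["past", "present", "present", "future"],
}
results = pairwise_agreement(codings)
for pair, pct in results.items():
    print(pair, pct)
```

Percent agreement does not correct for agreement expected by chance, so figures of this kind tend to look more favorable than chance-corrected indices would.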

EVALUATIVE REACTIONS 

Practicality: The test is not practical for large scale assessment because children work in
small groups and scoring and analysis are complicated. Resulting data would not be
available to teachers in a useful form.

Validity For Specific Purposes and Populations: Inadequate information to judge. 
Reliability: Inadequate information to judge. 

Overall Adequacy: The test is a poor assessment tool for measuring communication ability.
Although it focuses on children's speech, the serious issues of validity and reliability are
unaddressed. It also seems to tap other skills such as sequencing in session 2. The scoring
system is complicated and cumbersome for teachers.

MATERIALS REVIEWED: The University of Arizona, College of Education. Use of Situational
Language Tasks in an Intra-TEEM and TEEM Versus Comparison Evaluation.
Tucson, AZ: author, 1976.



OTHER REFERENCES: None 



REVIEWER: Janice Patterson 
38. Speech in the Classroom: Assessment
Instruments of Speaking Skills



AGE RANGE: Grades 1-12.



SKILLS TESTED: X Speaking
_ Listening
Interaction 

Visual Encoding 

Visual Decoding 

X Subskill or Attitude: speaking experience and attitudes 



COST: Not specified. 

TIME REQUIRED FOR ADMINISTRATION: Assessment of Speaking, approximately
five minutes; Inventory of Experiences, approximately five minutes; and Summary of
Attitudes, approximately five minutes.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test consists of
three parts. The first is an assessment of speaking skills. This is a one-on-one assessment
of the student with the administrator. The student chooses a picture to tell a story about and
then performance is rated on a 1-4 point scale. The second part is an inventory of classroom
speaking experiences. This is a paper and pencil test. The test has two levels, one for grades
1-6 and one for grades 4-12. The test asks students fifteen or twenty-five questions about
speaking experience in the classroom. The test asks the teacher fifteen to twenty-five related
questions about speaking activities in the classroom. The third part is a survey of attitudes
toward classroom speech situations. This is a paper and pencil test. The test has two levels,
one for grades 1-6 and one for grades 4-12. The test asks the student twelve or twenty
questions about attitudes toward self speaking in class.

NORM/CRITERION DATA: No information provided. 

VALIDITY 



Predictive: No information provided. 

Concurrent: No information provided.

Content and Item Selection: No information provided.

Construct and Other Empirical Studies: No information provided.

RELIABILITY
Alternate Forms: Not applicable.
Test-Retest: No information provided.
Scoring: No information provided.
Internal Consistency: No information provided.

EVALUATIVE REACTIONS

Practicality: The assessment of speaking skills is feasible but the test requires one-on-one
assessment and trained scorers. The other tests are easy to administer.

Validity For Specific Purposes and Populations: Inadequate information to judge.

Reliability: Inadequate information to judge.

Overall Adequacy: The test is in the developmental stages. It needs more testing and
documentation.

MATERIALS REVIEWED: Kozpol, S., and Cercone, K. Speech in the Classroom:
Assessment Instruments. Harrisburg, PA: Pennsylvania Department of Education, 1980.

OTHER REFERENCES: None.

REVIEWER: Nancy Mead 



39. Test of Adolescent Language



AGE RANGE: Ages 11-18. 

SKILLS TESTED: Speaking 
X Listening 

Interaction

Visual Encoding

_ Visual Decoding

_ Subskill or Attitude

COST: Approximately $75. 

TIME REQUIRED FOR ADMINISTRATION: One to three hours (test is open ended).

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test is designed to
identify language proficiency, to identify strengths and weaknesses in various dimensions
of language, and to measure progress. It includes eight subtests covering reading, writing,
speaking, and listening. Four are related to speaking and listening. For Listening/Vocabulary,
a word is read aloud and students identify two pictures which relate to the word. For
Listening/Grammar, three sentences are read aloud and students identify two that express the same
thought. For Speaking/Vocabulary, a word is read aloud and the student uses the word
correctly in a meaningful sentence. For Speaking/Grammar, a sentence is read aloud and
students repeat it aloud.

NORM/CRITERION DATA: The test was normed on 2,723 students in seventeen states
between ages 11 and 18.

VALIDITY 
Predictive: No information provided.

Concurrent: With thirty-two subjects, total composite listening correlated .51 with the
Peabody Picture Vocabulary Test.

Total composite speaking correlated .60 with memory for related syllables from the Detroit
Tests of Learning Aptitude.

Content and Item Selection: Test developers used standard procedures to identify item
difficulty and discrimination power.

Construct and Other Empirical Studies: Various studies supported hypotheses of age
differentiation, subtest interrelationship, group differentiation, and relationship with tests
of intelligence.



RELIABILITY 
Alternate Forms: Not applicable.

Test-Retest: Correlations ranged from .74 to .85, with higher correlations for composites.
Scoring: Inter-rater reliability for speaking/vocabulary was .96.

Internal Consistency: Average inter-item correlations for subtests ranged from .60 to .90
with higher correlations for composites.

EVALUATIVE REACTIONS

Practicality: The speaking tests require individual administration and trained scorers but
are straightforward. Listening tests are easy to administer.

Validity For Specific Purposes and Populations: Evidence indicates test validity is very
good.

Reliability: Evidence indicates test reliability is very good.
Overall Adequacy: The measures focus on very narrow subskills.

MATERIALS REVIEWED: Hammill, D. D.; Brown, V. L.; Larsen, S. C.; and Wiederholt,
J. L. Test of Adolescent Language. Austin, TX: PRO-ED, 1980.

OTHER REFERENCES: None.

REVIEWER: Nancy Mead
40. Test of Listening Accuracy in Children



AGE RANGE: Kindergarten-grade 6. 

SKILLS TESTED: _ Speaking
X Listening
_ Interaction
_ Visual Encoding
X Visual Decoding
_ Subskill or Attitude



COST: Not specified. 

TIME REQUIRED FOR ADMINISTRATION: Not specified.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: Two tests are available.
One is for individual testing and the other for group testing. Children are presented with
picture pairs and then hear one of the pictures named. They are to identify the appropriate
picture.

NORM/CRITERION DATA: Some norms are reported and qualitative classes (average,
superior, etc.) are provided, but no research base for the categorical assignments is provided.

VALIDITY 
Predictive: No information provided. 
Concurrent: No information provided. 
Content and Item Selection: No information provided. 
Construct and Other Empirical Studies: No information provided. 
RELIABILITY

Alternate Forms: Not applicable.
Test-Retest: No information provided.
Scoring: Not applicable.

Internal Consistency: Internal consistency correlations ranged from .75 to .95 depending
on groups and scales.

EVALUATIVE REACTIONS 

Practicality: The ability to use this test with either groups or individuals makes it attractive. 
It seems usable by classroom teachers. 

Validity For Specific Purposes and Populations: Inadequate information to judge.

Reliability: Evidence indicates test reliability is good.

Overall Adequacy: The test may be useful for teachers but suffers from insufficient
normative, empirical, and user-oriented data.

MATERIALS REVIEWED: Mecham, M. J.; Jex, J. L.; and Jones, J. D. Test of Listening
Accuracy in Children, Manuals for Individual and Group Testing. Salt Lake City:
Communication Associates, Inc.



OTHER REFERENCES: None 



REVIEWER: John Daly 




41. Torrance Tests of Creative Thinking: Verbal
Test

AGE RANGE: Kindergarten-grade 3.



SKILLS TESTED: X Speaking 

Listening 

_ Interaction

Visual Encoding

Visual Decoding

Subskill or Attitude 



COST: $8.50 for twenty-five booklets; $1.45 for scoring each student.

TIME REQUIRED FOR ADMINISTRATION: Approximately forty-five minutes.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The verbal test
measures creative thinking using words. The test has two forms (A and B). For students from
kindergarten through grade 3, the test is administered individually and students give responses
orally. At older ages the test is group administered and individuals respond in writing. The
test includes seven tasks which require creative responses, for example, listing possible
causes for events shown in a picture. Responses are evaluated in terms of fluency, originality,
and in some cases flexibility. The fluency score is primarily the number of relevant responses.
An optional scoring for elaboration is mentioned but no scoring guides are provided.

NORM/CRITERION DATA: Some norm data are provided but the technical manual was
not available for review.



VALIDITY 

Predictive: Some predictive data are provided but the technical manual was not available 
for review. 

Concurrent: No information available. 

Content and Item Selection: No information available. 

Construct and Other Empirical Studies: No information available. 
RELIABILITY
Alternate Forms: No information available.
Test-Retest: No information available.

Scoring: Correlation between scores of trained and untrained scorers ranged from .86 to
.96. More information is available in the technical manual.

Internal Consistency: No information available. 

EVALUATIVE REACTIONS

Practicality: The test requires very lengthy individual testing. Maintaining the attention
span of young children might be difficult. Minimal guidance is given for translating tasks
into language for young children.

Validity For Specific Purposes and Populations: Inadequate information to judge.

Reliability: Preliminary evidence indicates high scorer reliability.

Overall Adequacy: The only measure of oral ability in this test is one of quantity.

MATERIALS REVIEWED: Torrance, E. P. Torrance Tests of Creative Thinking.
Bensenville, IL: Scholastic Testing Service, 1966.

OTHER REFERENCES: None.

REVIEWER: Nancy Mead



42. Utah Test of Language Development (Direct
Test Version)



AGE RANGE: Preschool-grade 9 (ages 2-14). 



SKILLS TESTED: 



Speaking 



Listening 
Interaction 

_ Visual Encoding

_ Visual Decoding

_ Subskill or Attitude



COST: Not specified. 



TIME REQUIRED FOR ADMINISTRATION: Twenty to forty minutes.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test is designed to
assess language production and comprehension skills. Test items include responding to
instructions, naming objects, repeating digital span (forward and backward), indicating
receptive vocabulary, drawing simple shapes, writing numbers and letters (both manuscript
and cursive), telling a story, and reading on a primer level. Scoring is for total test performance
and does not provide subtest scores. Two forms are available: the direct-test version and the
informant-interview version (available through the American Guidance Association).

NORM/CRITERION DATA: Norms are based on 393 children in twenty-three states. These
data were combined with a Utah sample of 273 children, judged representative of a normal
population. Later, data were collected on 989 kindergarten children including minorities.
The norms provide language-age equivalents.



VALIDITY 
Predictive: No information provided. 

Concurrent: The test correlated .72 and .81 with the Verbal Language Development Scale,
.53 with the Mean Length of Utterances, and .87 and .91 with the Illinois Test of
Psycholinguistic Abilities.

Content and Item Selection: Items were selected from standard sources.

Construct and Other Empirical Studies: No information provided.
RELIABILITY



Alternate Forms: The two forms, the direct test and the informant interview form, correlated
.81 with a time interval of approximately three weeks.

Test-Retest: See above. 

Scoring: No information provided. 

Internal Consistency: Split-half correlation was .94. 

EVALUATIVE REACTIONS 

Practicality: The test is administered individually; thus, it is not practical for large scale
testing. No specific training is needed.

Validity For Specific Purposes and Populations: The stated purpose of assessing broad
language skills is violated by inclusion of items on reading and small motor skills (writing
and drawing) and items requiring memory of a digital span (often seen on intelligence
tests).

Reliability: Evidence indicates test reliability is very good. 

Overall Adequacy: The test is a poor measure of communication because many items call 
for proficiencies other than language ability. 



MATERIALS REVIEWED: Mecham, M. J., and Jones, J. D. Utah Test of Language
Development Manual of Directions. Salt Lake City: Communication Research Associates,
Inc., 1978.



OTHER REFERENCES: None. 



REVIEWER: Janice Patterson 



43. Vermont Basic Competency Program
Speaking and Listening Assessments

AGE RANGE: Kindergarten-grade 12.

SKILLS TESTED: X Speaking 

Listening 
X Interaction

_ Visual Encoding

_ Visual Decoding

_ Subskill or Attitude

COST: No costs beyond record keeping.

TIME REQUIRED FOR ADMINISTRATION: Highly variable.

DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: Precise implementation
of this test is determined by local districts, and apparently varies across age levels. Speaking
tasks include (1) giving directions, (2) extended expository, informative, or persuasive talk,
(3) answering telephones and taking messages, (4) using telephones to get information and
assistance, (5) introducing self and others, (6) interviewing for a job, and (7) participating
in informal discussion. Listening tasks include (1) following directions, (2) retelling a
narrative sequence, and (3) summarizing factual material. Several of these involve
simulated tasks, while informal discussion skill is assessed naturalistically over time. Evaluation
criteria are not wholly specified, but apparently include accuracy, use of standard English,
organization, clear articulation, and other functionally related criteria.

NORM/CRITERION DATA: The percentages of 12-, 14-, and 15-year-olds passing each
competency are reported. These results apparently summarize scores of several thousand students
throughout Vermont.

VALIDITY
Predictive: No information provided.
Concurrent: No information provided.

Content and Item Selection: The development of competencies (tasks) was based on
input of 1,500 Vermont educators as well as an extensive search of the literature.

Construct and Other Empirical Studies: No information provided.
RELIABILITY

Alternate Forms: Not applicable.

Test-Retest: No information provided.

Scoring: No information provided.

Internal Consistency: No information provided.



EVALUATIVE REACTIONS 



Practicality: The procedures place the burden of testing and record keeping on teachers.
It is difficult to check on compliance in implementation. The test would require massive
in-service training, with probable beneficial effects. Ideally the procedures would include
a second rater, which would increase cost.

Validity For Specific Purposes and Populations: The test includes a good sampling of
communication situations. The lack of precisely defined evaluation criteria makes validity
difficult to judge. If the procedures are closely tied to instruction, it may not be a valid
measure of individual ability.

Reliability: Use of a single classroom teacher rating performance without clearly defined
guidelines is a major problem. Expectations and bias would likely be major factors. Also,
procedures make no provision for consistency of administration.

Overall Adequacy: The program seems well motivated by a concern for functional
communication competence. Use of contextually diverse tasks is especially admirable. But
the lack of well defined procedures compromises the value of results.

MATERIALS REVIEWED: Vermont Department of Education. Basic Competencies: A
Manual of Information and Guidelines for Teachers and Administrators. Vermont's Basic
Competency Program. 1978-1979 Report. 1979-1980 Report. Montpelier: author, 1978-79,
1979-80.



OTHER REFERENCES: None.



REVIEWER: Don Rubin 
44. Wallner Test of Listening Comprehension

AGE RANGE: Kindergarten-grade 1.



SKILLS TESTED: _ Speaking
X Listening
Interaction
Visual Encoding
_ Visual Decoding
_ Subskill or Attitude



COST: Not commercially available. 



TIME REQUIRED FOR ADMINISTRATION: Not specified. 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: The test has two forms
(A and B). Each form consists of six passages. Each passage is followed
by seven literal comprehension and seven inferential questions. The test administrator reads
the passages and questions aloud. Students pick the picture, from three choices, that best
answers the question.



NORM/CRITERION DATA: No information provided. 



VALIDITY 

Predictive: One year after taking the test, 107 students were given the Stanford
Achievement Test, Primary I Battery, Form W. Correlations between Forms A and B and the
Stanford test were .68 and .64, respectively.

Concurrent: Correlations between Forms A and B and the Listening Subtest of the
Metropolitan Readiness Tests (Form B) were .59 and .60, respectively, based on a sample of
150 students. Correlations with the Metropolitan Readiness Test total score were .74 and



Content and Item Selection: Passages were composed on the basis of the Dale-Chall and
Spache Readability formulas. A panel of experts in reading verified the skill placement
and content validity of the items.

Construct and Other Empirical Studies: No information provided.
RELIABILITY

Alternate Forms: The two forms correlated .89 based on a sample of 140 students. 



Test-Retest: No information provided. 
Scoring: Not applicable. 

Internal Consistency: Average inter-item correlations were .95 for Form A and .95 for
Form B, based on a sample of 140 students.



EVALUATIVE REACTIONS 

Practicality: No information is provided about the length of test administration. However,
administration and scoring procedures are not complicated.

Validity For Specific Purposes and Populations: The test seems more closely related to
general verbal ability than to listening ability per se. Effects of children's prior knowledge
about subject matter are unknown.

Reliability: Variation in administration may adversely affect reliability. The test appears
highly reliable in other respects.

Overall Adequacy: The test adopts a non-interactive definition of listening comprehension
based on extended written prose rather than oral language stimuli.

MATERIALS REVIEWED: Wallner, N. K. "The Development of a Listening
Comprehension Test for Kindergarten and Beginning First Grade." Educational and Psychological
Measurement 34(1974):391-396.



OTHER REFERENCES: None. 



REVIEWER: Don Rubin 
45. Westside High School Minimum Competency
Test




AGE RANGE: Grade 10. 



SKILLS TESTED: X Speaking
Listening
Interaction
_ Visual Encoding
_ Visual Decoding
^ Subskill or Attitude 



COST: Only record keeping costs. 



TIME REQUIRED FOR ADMINISTRATION: Not specified. 



DESCRIPTION OF TEST, PROCEDURES, ITEMS, SCORING: Students choose a topic
and individually present an extended discourse to a group. A planning sheet emphasizing
purpose, development, and organization is provided. The student is encouraged to rehearse
the talk prior to testing. Dichotomously scored criteria include (1) introduction, (2) supporting
material, (3) conclusion, (4) language including grammar and word choice improprieties,
(5) volume, (6) eye contact, and (7) response to questions. Students must demonstrate
mastery on all criteria for a passing score.
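
The scoring rule described above is conjunctive: each criterion is rated pass or fail, and one failure fails the whole performance. The sketch below illustrates that rule; the criterion identifiers are shortened from the list above, and the function names and sample ratings are hypothetical rather than drawn from the Westside materials.

```python
# The seven dichotomously scored criteria listed in the description
CRITERIA = (
    "introduction",
    "supporting_material",
    "conclusion",
    "language",
    "volume",
    "eye_contact",
    "response_to_questions",
)

def failed_criteria(ratings):
    """Criteria on which the student did not demonstrate mastery."""
    return [c for c in CRITERIA if not ratings.get(c, False)]

def passes(ratings):
    """True only if every criterion was rated and every rating is a pass."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"unrated criteria: {missing}")
    return all(ratings[c] for c in CRITERIA)

# Hypothetical rating sheet: mastery everywhere except eye contact
ratings = {c: True for c in CRITERIA}
ratings["eye_contact"] = False

print(passes(ratings))            # False: one failed criterion fails the test
print(failed_criteria(ratings))   # ['eye_contact']
```

Because a single observer's judgment on any one criterion decides the outcome, a rule of this kind is sensitive to rater error, which bears on the reliability concern the reviewer raises.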



NORM/CRITERION DATA: No information provided.



VALIDITY 
Predictive: No information provided. 
Concurrent: No information provided.
Content and Item Selection: No information provided.
Construct and Other Empirical Studies: No information provided.
RELIABILITY
Alternate Forms: Not applicable.
Test-Retest: No information provided.
Scoring: No information provided.
Internal Consistency: No information provided.

EVALUATIVE REACTIONS 

Practicality: The test is easily administered and scored. 

Validity For Specific Purposes and Populations: The test measures a very narrow range
of situations. The criteria are more formal than functional. Emphasis on language "errors"
may bias the test against speakers of nonstandard dialects.

Reliability: Use of a single observer calls reliability into question. Allowing students free
choice of topic may also introduce measurement error.

Overall Adequacy: The test fails to sample a spectrum of communication competencies.
No provision is made for simulating a communicative context. Criteria emphasize
mechanical aspects of public speaking.

MATERIALS REVIEWED: Westside Community Schools. Minimum Competency Packet.
Omaha, NE: author, 1979.

OTHER REFERENCES: None.

REVIEWER: Don Rubin 
Appendix A

Standards for Effective Oral Communication
Programs

Prepared by: American Speech-Language-Hearing
Association and Speech Communication
Association

Adequate oral communication frequently determines an
individual's educational, social, and vocational success.
Yet, American education has typically neglected formal
instruction in the basic skills of speaking and listening.
It is important that state and local education agencies
implement the most effective oral communication programs
possible.

The following standards for oral communication were
developed by representatives of the Speech Communication
Association and the American Speech-Language-Hearing
Association.

If effective oral communication programs are going
to be developed, all components of the recommended
standards must be considered. Implementation of these
standards will facilitate development of adequate and
appropriate oral communication necessary for educational,
social, and vocational success.

Definition 

Oral Communication: the process of interacting through 
heard and spoken messages in a variety of situations. 
Effective oral communication is a learned behavior, 
involving the following processes: 

1. Speaking in a variety of educational and social 
situations: Speaking involves, but is not limited to, 
arranging and producing messages through the use 
of voice, articulation, vocabulary, syntax, and 
nonverbal cues (e.g., gesture, facial expression, vocal 
cues) appropriate to the speaker and listeners. 

2. Listening in a variety of educational and social 
situations: Listening involves, but is not limited to, 
hearing, perceiving, discriminating, interpreting, 
synthesizing, evaluating, organizing, and 
remembering information from verbal and nonverbal 
messages. 

Basic Assumptions 

1. Oral communication behaviors of students can be 
improved through direct instruction. 

2. Oral communication instruction emphasizes the 
interactive nature of speaking and listening. 

3. Oral communication instruction addresses the everyday 
communication needs of students and includes 
emphasis on the classroom as a practical 
communication environment. 

4. There is a wide range of communication competence 
among speakers of the same language. 

5. Communication competence is not dependent upon 
use of a particular form of language. 

6. A primary goal of oral communication instruction is 
to increase the students' repertoire and use of 
effective speaking and listening behaviors. 

7. Oral communication programs provide instruction 
based on a coordinated developmental continuum of 
skills, preschool through adult. 

8. Oral communication skills can be enhanced by using 
parents, supportive personnel, and appropriate in- 
structional technology. 

An Effective Communication Program 
Has the Following Characteristics: 

Teaching/Learning 

1. The oral communication program is based on current 
theory and research in speech and language 
development, psycholinguistics, rhetorical and 
communication theory, communication disorders, speech 
science, and related fields of study. 

2. Oral communication instruction is a clearly 
identifiable part of the curriculum. 

3. Oral communication instruction is systematically 
related to reading and writing instruction and to 
instruction in the various content areas. 

4. The relevant academic, personal, and social 
experiences of students provide core subject matter for 
the oral communication program. 

5. Oral communication instruction provides a wide range 
of speaking and listening experience, in order to 
develop effective communication skills appropriate to: 

a. a range of situations; e.g., informal to formal, 
interpersonal to mass communication. 

b. a range of purposes; e.g., informing, learning, 
persuading, evaluating messages, facilitating 
social interaction, sharing feelings, imaginative and 
creative expression. 

c. a range of audiences; e.g., classmates, teachers, 
peers, employers, family, community. 

d. a range of communication forms; e.g., conversation, 
group discussion, interview, drama, debate, 
public speaking, oral interpretation. 

e. a range of speaking styles; impromptu, 
extemporaneous, and reading from manuscript. 

6. The oral communication program provides class time 
for systematic instruction in oral communication skills, 
e.g., critical listening, selecting, arranging, and 
presenting messages, giving and receiving constructive 
feedback, nonverbal communication, etc. 

7. The oral communication program includes 
development of adequate and appropriate language, 
articulation, voice, fluency, and listening skills necessary 
for success in educational, career, and social 
situations through regular classroom instruction, 
cocurricular activities, and speech-language pathology and 
audiology services. 

8. Oral communication program instruction encourages 
and provides appropriate opportunities for the 
reticent student (e.g., one who is excessively fearful in 
speaking situations) to participate more effectively 
in oral communication. 

Support 

1. Oral communication instruction is provided by 
individuals adequately trained in oral communication 
and/or communication disorders, as evidenced by 
appropriate certification. 

2. Individuals responsible for oral communication 
instruction receive continuing education on theories, 
research, and instruction relevant to communication. 

3. Individuals responsible for oral communication 
instruction participate actively in conventions, 
meetings, publications, and other activities of 
communication professionals. 

4. The oral communication program includes a system 
for training classroom teachers to identify and refer 
students who do not have adequate listening and 
speaking skills, or are reticent, to those qualified 
individuals who can best meet the needs of the 
student through further assessment and/or instruction. 

5. Teachers in all curriculum areas receive information 
on appropriate methods for: (a) using oral 
communication to facilitate instruction, and (b) using the 
subject matter to improve students' oral 
communication skills. 

6. Parent and community groups are informed about 
and provided with appropriate materials for effective 
involvement in the oral communication program. 

7. The oral communication program is facilitated by 
availability and use of appropriate instructional 
materials, equipment, and facilities. 

Assessment and Evaluation 

1. The oral communication program is based on a schoolwide 
assessment of the speaking and listening needs 
of students. 

2. Speaking and listening needs of students will be de- 
termined by qualified personnel utilizing appropriate 
evaluation tools for the skills to be assessed, and 
educational levels of students being assessed. 

3. Evaluation of student progress in oral communication 
is based upon a variety of data including 
observations, self-evaluations, listeners' responses to 
messages, and formal tests. 

4. Evaluation of students' oral communication 
encourages, rather than discourages, students' desires to 
communicate by emphasizing those behaviors which 
students can improve, thus enhancing their ability to 
do so. 

5. Evaluation of the total oral communication program 
is based on achievement of acceptable levels of oral 
communication skill determined by continuous 
monitoring of student progress in speaking and listening, 
use of standardized and criterion-referenced tests, 
audience-based rating scales, and other appropriate 
instruments. 



Appendix B 

Criteria for Evaluating Instruments and 
Procedures for Assessing Speaking and Listening 



The following criteria may be applied to published and 
unpublished instruments and procedures for assessing 
speaking and listening skills of children and adults. The 
criteria are organized around (a) content considerations, 
which deal primarily with the substance of speaking 
and listening instruments and procedures, and (b) technical 
considerations, which deal with such matters as 
reliability, validity, and information on administration. 

1. Stimulus materials should require the individual being 
tested to demonstrate skill as a speaker or listener. 

2. Assessment instruments and procedures should clearly 
distinguish speaking and listening performance from 
reading and writing ability; i.e., inferences of 
speaking and listening competence should not be 
made from tests of reading and writing, and 
directions and responses for speaking and/or listening 
tests should not be mediated through reading and 
writing modes. 

3. Assessment instruments and procedures should be 
free of sexual, cultural, racial, and ethnic content 
and/or stereotyping. 

4. Assessment should confirm the presence or absence 
of skills, not diagnose reasons why individuals 
demonstrate or fail to demonstrate those skills. 

5. Assessment should emphasize the application of 
speaking and listening skills that relate to familiar 
situations; i.e., stimulus materials should refer to 
situations recognizable to the individual being tested 
and should facilitate demonstration of skills rather 
than demonstration of content mastery. 

6. Assessment should test skills that are important for 
various communication settings (e.g., interpersonal, 
small group, public, and mass communication 
settings) rather than be limited to one setting. 

7. Assessment should permit a range of acceptable 
responses, where such a range is appropriate. 

8. Assessment should demonstrate that outcomes are 
more than just chance evidence; i.e., assessment 
should be reliable. 

9. Assessment should provide results that are consistent 
with other evidence that might be available. 

10. Assessment should have content validity. 

11. Assessment procedures should be standardized and 
detailed enough so that individual responses will 
not be affected by the administrator's skills in 
administering the procedures. 

12. Assessment procedures should approximate the 
recognized stress level of oral communication; they 
should not increase or eliminate it. 

13. Assessment procedures should be practical in terms 
of cost and time. 

14. Assessment should involve simple equipment. 

15. Assessment should be suitable for the developmental 
level of the individual being tested. 

Developed by Kenneth L. Brown, 
Joanne Gurry, and Fred E. Jandt acting as a subgroup 
of the Speech Communication Association's Educational 
Policies Board Task Force on Assessment and 
Testing. Approved and endorsed by the Educational 
Policies Board and the Administrative Committee of the 
Speech Communication Association. 